Abstract

It is well known that freeness and linearity information positively interact with
aliasing information, allowing both the precision and the efficiency of the sharing
analysis of logic programs to be improved. In this paper we present a novel combination
of set-sharing with freeness and linearity information, which is characterized by an
improved abstract unification operator. We provide a new abstraction function and prove
the correctness of the analysis for both the finite tree and the rational tree cases.
Moreover, we show that the same notion of redundant information as identified in
(Bagnara et al. 2002; Zaffanella et al. 2002) also applies to this abstract domain
combination: this allows for the implementation of an abstract unification operator
running in polynomial time and achieving the same precision on all the considered
observable properties.

KEYWORDS: Abstract Interpretation; Logic Programming; Abstract Unification;
Rational Trees; Set-Sharing; Freeness; Linearity.



1 Introduction 

Even though the set-sharing domain is, in a sense, remarkably precise, more precision
is attainable by combining it with other domains. In particular, freeness and
linearity information has received much attention in the literature on sharing
analysis (recall that a variable is said to be free if it is not bound to a
non-variable term; it is linear if it is not bound to a term containing multiple
occurrences of another variable).

and "Automatic Aggregate- and Number-Reasoning for Computing: from Decision Algorithms
to Constraint Programming with Multisets, Sets, and Maps"; by the Integrated Action
Italy-Spain "Advanced Development Environments for Logic Programs"; by the University
of Parma's FIL scientific research project (ex 60%) "Pure and applied mathematics";
and by the UK's Engineering and Physical Sciences Research Council (EPSRC) under
grant M05645.

As argued informally by Søndergaard (Søndergaard 1986), the mutual interaction
between linearity and aliasing information can improve the accuracy of a sharing
analysis. This observation has been formally applied in (Codish et al. 1991) to the
specification of the abstract mgu operator for the domain ASub. In his PhD thesis
(Langen 1990), Langen proposed a similar integration with linearity, but for the
set-sharing domain. He has also shown how the aliasing information allows freeness
to be computed with a good degree of accuracy (however, freeness information was
not exploited to improve aliasing). King (King 1994) has also shown how a more
refined tracking of linearity allows for further precision improvements.

The synergy attainable from a bi-directional interaction between aliasing and
freeness information was initially pointed out by Muthukumar and Hermenegildo
(Muthukumar and Hermenegildo 1991; Muthukumar and Hermenegildo 1992). Since
then, several authors considered the integration of set-sharing with freeness,
sometimes also including additional explicit structural information
(Codish et al. 1993; Codish et al. 1996; Filé 1994; King and Soper 1994).

Building on the results obtained in (Søndergaard 1986), in (Codish et al. 1991) and by
Muthukumar and Hermenegildo, but independently from (Langen 1990), Hans and Winkler
(Hans and Winkler 1992) proposed a combined integration of freeness and linearity
information with set-sharing. Similar combinations have been proposed in
(Bruynooghe and Codish 1993; Bruynooghe et al. 1994a; Bruynooghe et al. 1994b).
From a more pragmatic point of view, Codish et al. (Codish et al. 1993;
Codish et al. 1995) integrate the information captured by the domains of
(Søndergaard 1986) and (Muthukumar and Hermenegildo 1991) by performing the analysis
with both domains at the same time, exchanging information between the two
components at each step.

Most of the above proposals differ in the carrier of the underlying abstract do- 
main. Even when considering the simplest domain combinations where explicit 
structural information is ignored, there is no general consensus on the specifica- 
tion of the abstract unification procedure. From a theoretical point of view, once 
the abstract domain has been related to the concrete one by means of a Galois 
connection, it is always possible to specify the best correct approximation of each 
operator of the concrete semantics. However, empirical observations suggest that
sub-optimal operators are likely to result in better complexity/precision trade-offs
(Bagnara et al. 2000). As a consequence, it is almost impossible to identify "the
right combination" of variable aliasing with freeness and linearity information, at
least when practical issues, such as the complexity of the abstract unification
procedure, are taken into account.

Given this state of affairs, we will now consider a domain combination whose
carrier is essentially the same as specified by Langen (Langen 1990) and Hans and
Winkler (Hans and Winkler 1992). (The same domain combination was also considered
by Bruynooghe et al. (Bruynooghe et al. 1994a; Bruynooghe et al. 1994b),
but with the addition of compoundness and explicit structural information.) The
novelty of our proposal lies in the specification of an improved abstract unification
procedure, better exploiting the interaction between sharing and linearity. As a
matter of fact, we provide an example showing that all previous approaches to the
combination of set-sharing with freeness and linearity are not uniformly more precise
than the analysis based on the ASub domain (Codish et al. 1991; King 2000;
Søndergaard 1986), whereas such a property is enjoyed by our proposal.

By extending the results of (Hill et al. 2002) to this combination, we provide a
new abstraction function that can be applied to any logic language computing on
domains of syntactic structures, with or without the occurs-check; by using this
abstraction function, we also prove the correctness of the new abstract unification
procedure. Moreover, we show that the same notion of redundant information as
identified in (Bagnara et al. 2002; Zaffanella et al. 2002) also applies to this
abstract domain combination. As a consequence, it is possible to implement an
algorithm for abstract unification running in polynomial time and still obtain the same
precision on all the considered observables: groundness, independence, freeness and
linearity.

This paper is based on (Zaffanella 2001, Chapter 6), the PhD thesis of the second
author. In Section 2 we define some notation and recall the basic concepts used later
in the paper. In Section 3 we present the domain SFL that integrates set-sharing,
freeness and linearity. In Section 4 we show that SFL is uniformly more precise
than the domain ASub, whereas all the previous proposals for a domain integrating
set-sharing and linearity fail to satisfy such a property. In Section 5 we show
that the domain SFL can be simplified by removing some redundant information.
In Section 6 we provide an experimental evaluation using the China analyzer
(Bagnara 1997). In Section 7 we discuss some related work. Section 8 concludes
with some final remarks. The proofs of the results stated here are not included but
all of them are available in an extended version of this paper (Hill et al. 2003).

2 Preliminaries 

For a set $S$, $\wp(S)$ is the powerset of $S$. The cardinality of $S$ is denoted by
$\# S$ and the empty set is denoted by $\emptyset$. The notation $\wp_f(S)$ stands for
the set of all the finite subsets of $S$, while the notation $S \subseteq_f T$ stands
for $S \in \wp_f(T)$. The set of all finite sequences of elements of $S$ is denoted by
$S^*$, the empty sequence by $\epsilon$, and the concatenation of $s_1, s_2 \in S^*$
is denoted by $s_1 \mathbin{.} s_2$.

2.1 Terms and Trees 

Let Sig denote a possibly infinite set of function symbols, ranked over the set of 
natural numbers. Let Vars denote a denumerable set of variables, disjoint from Sig. 
Then Terms denotes the free algebra of all (possibly infinite) terms in the signature 
Sig having variables in Vars. Thus a term can be seen as an ordered labeled tree, 
possibly having some infinite paths and possibly containing variables: every inner 
node is labeled with a function symbol in Sig with a rank matching the number of 
the node's immediate descendants, whereas every leaf is labeled by either a variable 
in Vars or a function symbol in Sig having rank 0 (a constant). It is assumed that
Sig contains at least two distinct function symbols, with one of them having rank 0.
If $t \in \mathit{Terms}$ then $\mathrm{vars}(t)$ and $\mathrm{mvars}(t)$ denote the set
and the multiset of variables occurring in $t$, respectively. We will also write
$\mathrm{vars}(o)$ to denote the set of variables occurring in an arbitrary syntactic
object $o$.

Suppose $s, t \in \mathit{Terms}$: $s$ and $t$ are independent if
$\mathrm{vars}(s) \cap \mathrm{vars}(t) = \emptyset$; we say that variable $y$ occurs
linearly in $t$, more briefly written using the predication
$\mathrm{occ\_lin}(y, t)$, if $y$ occurs exactly once in $\mathrm{mvars}(t)$; $t$ is
said to be ground if $\mathrm{vars}(t) = \emptyset$; $t$ is free if
$t \in \mathit{Vars}$; $t$ is linear if, for all $y \in \mathrm{vars}(t)$, we have
$\mathrm{occ\_lin}(y, t)$; finally, $t$ is a finite term (or Herbrand term) if it
contains a finite number of occurrences of function symbols. The sets of all ground,
linear and finite terms are denoted by GTerms, LTerms and HTerms, respectively.




2.2 Substitutions 

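A substitution is a total function $\sigma\colon \mathit{Vars} \to \mathit{HTerms}$
that is the identity almost everywhere; in other words, the domain of $\sigma$,

\[ \mathrm{dom}(\sigma) \stackrel{\mathrm{def}}{=}
   \{\, x \in \mathit{Vars} \mid \sigma(x) \neq x \,\}, \]

is finite. Given a substitution $\sigma\colon \mathit{Vars} \to \mathit{HTerms}$, we
overload the symbol '$\sigma$' so as to denote also the function
$\sigma\colon \mathit{HTerms} \to \mathit{HTerms}$ defined as follows, for each
term $t \in \mathit{HTerms}$:

\[ \sigma(t) \stackrel{\mathrm{def}}{=}
   \begin{cases}
     t, & \text{if $t$ is a constant symbol;} \\
     \sigma(t), & \text{if $t \in \mathit{Vars}$;} \\
     f\bigl(\sigma(t_1), \ldots, \sigma(t_n)\bigr), & \text{if $t = f(t_1, \ldots, t_n)$.}
   \end{cases} \]

If $t \in \mathit{HTerms}$, we write $t\sigma$ to denote $\sigma(t)$. Note that, for
each substitution $\sigma$ and each finite term $t \in \mathit{HTerms}$, if
$t\sigma \in \mathit{Vars}$, then $t \in \mathit{Vars}$.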

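If $x \in \mathit{Vars}$ and $t \in \mathit{HTerms} \setminus \{x\}$, then
$x \mapsto t$ is called a binding. The set of all bindings is denoted by Bind.
Substitutions are denoted by the set of their bindings, thus a substitution $\sigma$
is identified with the (finite) set

\[ \{\, x \mapsto x\sigma \mid x \in \mathrm{dom}(\sigma) \,\}. \]

We denote by $\mathrm{vars}(\sigma)$ the set of variables occurring in the bindings of
$\sigma$. We also define
$\mathrm{range}(\sigma) \stackrel{\mathrm{def}}{=}
 \bigcup \{\, \mathrm{vars}(x\sigma) \mid x \in \mathrm{dom}(\sigma) \,\}$.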

A substitution is said to be circular if, for $n > 1$, it has the form

\[ \{ x_1 \mapsto x_2, \ldots, x_{n-1} \mapsto x_n, x_n \mapsto x_1 \}, \]

where $x_1, \ldots, x_n$ are distinct variables. A substitution is in rational solved
form if it has no circular subset. The set of all substitutions in rational solved
form is denoted by RSubst. A substitution $\sigma$ is idempotent if, for all
$t \in \mathit{Terms}$, we have $t\sigma\sigma = t\sigma$. Equivalently, $\sigma$ is
idempotent if and only if
$\mathrm{dom}(\sigma) \cap \mathrm{range}(\sigma) = \emptyset$. The set of all
idempotent substitutions is denoted by ISubst and
$\mathit{ISubst} \subset \mathit{RSubst}$.

The composition of substitutions is defined in the usual way. Thus
$\tau \circ \sigma$ is the substitution such that, for all terms
$t \in \mathit{HTerms}$,

\[ t(\tau \circ \sigma) = t\sigma\tau \]

and has the formulation

\[ \tau \circ \sigma = \{\, x \mapsto x\sigma\tau \mid
   x \in \mathrm{dom}(\sigma) \cup \mathrm{dom}(\tau),\ x \neq x\sigma\tau \,\}.
   \tag{1} \]

As usual, $\sigma^0$ denotes the identity function (i.e., the empty substitution) and,
when $i > 0$, $\sigma^i$ denotes the substitution $(\sigma \circ \sigma^{i-1})$.

For each $\sigma \in \mathit{RSubst}$ and $s \in \mathit{HTerms}$, the sequence of
finite terms

\[ \sigma^0(s), \sigma^1(s), \sigma^2(s), \ldots \]

converges to a (possibly infinite) term, denoted $\sigma^\infty(s)$
(Intrigila and Zilli 1996; King 2000). Therefore, the function
$\mathrm{rt}\colon \mathit{HTerms} \times \mathit{RSubst} \to \mathit{Terms}$ such that

\[ \mathrm{rt}(s, \sigma) \stackrel{\mathrm{def}}{=} \sigma^\infty(s) \]

is well defined. Note that, in general, this function is not a substitution: while
having a finite domain, its "bindings" $x \mapsto \mathrm{rt}(x, \sigma)$ can map a
domain variable $x$ into a term
$\mathrm{rt}(x, \sigma) \in \mathit{Terms} \setminus \mathit{HTerms}$. However, as the
name of the function suggests, the term $\mathrm{rt}(x, \sigma)$ is guaranteed to be
rational, meaning that it can only have a finite number of distinct subterms and
hence can be finitely represented.

Example 1 

Consider the substitutions 

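\begin{align*}
  \sigma_1 &= \{ x \mapsto f(z), y \mapsto a \} \in \mathit{ISubst}, \\
  \sigma_2 &= \{ x \mapsto f(y), y \mapsto a \} \in \mathit{RSubst} \setminus \mathit{ISubst}, \\
  \sigma_3 &= \{ x \mapsto f(x) \} \in \mathit{RSubst} \setminus \mathit{ISubst}, \\
  \sigma_4 &= \{ x \mapsto f(y), y \mapsto f(x) \} \in \mathit{RSubst} \setminus \mathit{ISubst}, \\
  \sigma_5 &= \{ x \mapsto y, y \mapsto x \} \notin \mathit{RSubst}.
\end{align*}

Note that there are substitutions, such as $\sigma_2$, that are not idempotent and
nonetheless define finite trees only; namely, $\mathrm{rt}(x, \sigma_2) = f(a)$.
Similarly, there are other substitutions, such as $\sigma_4$, whose bindings are not
explicitly cyclic and nonetheless define rational trees that are infinite; namely,
$\mathrm{rt}(x, \sigma_4) = f(f(f(\cdots)))$. Finally note that the 'rt' function is
not defined on $\sigma_5 \notin \mathit{RSubst}$.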
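
The classification used in Example 1 can be checked mechanically. The following is a
minimal Python sketch and not part of the original development: it assumes, purely for
illustration, that a variable is a string, that a constant or compound term is a tuple
whose first component is the function symbol, and that a substitution is a dictionary
of bindings. The names `term_vars`, `is_idempotent` and `in_rsubst` are hypothetical.

```python
def term_vars(t):
    """Set of variables of a term (str = variable, tuple = function application)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))


def is_idempotent(sigma):
    """Idempotence test: dom(sigma) and range(sigma) must be disjoint."""
    rng = set().union(*(term_vars(t) for t in sigma.values()))
    return not (set(sigma) & rng)


def in_rsubst(sigma):
    """True when sigma has no circular subset {x1 -> x2, ..., xn -> x1}."""
    for start in sigma:
        seen, x = set(), start
        # Follow variable-to-variable bindings until they stop or revisit a variable.
        while x in sigma and isinstance(sigma[x], str) and x not in seen:
            seen.add(x)
            x = sigma[x]
        if x in seen:
            return False
    return True


# The five substitutions of Example 1; the constant a is encoded as ("a",).
s1 = {"x": ("f", "z"), "y": ("a",)}
s2 = {"x": ("f", "y"), "y": ("a",)}
s3 = {"x": ("f", "x")}
s4 = {"x": ("f", "y"), "y": ("f", "x")}
s5 = {"x": "y", "y": "x"}

for name, s in [("s1", s1), ("s2", s2), ("s3", s3), ("s4", s4), ("s5", s5)]:
    print(name, "in RSubst:", in_rsubst(s), "idempotent:", is_idempotent(s))
```

The sketch only tests membership in ISubst and RSubst; it does not build the (possibly
infinite) rational trees denoted by 'rt'.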

2.3 Equality Theories 

An equation is of the form $s = t$ where $s, t \in \mathit{HTerms}$. Eqs denotes the
set of all equations. A substitution $\sigma$ may be regarded as a finite set of
equations, that is, as the set $\{\, x = t \mid (x \mapsto t) \in \sigma \,\}$. We say
that a set of equations $e$ is in rational solved form if
$\{\, s \mapsto t \mid (s = t) \in e \,\} \in \mathit{RSubst}$. In the rest of the
paper, we will often write a substitution $\sigma \in \mathit{RSubst}$ to denote a set
of equations in rational solved form (and vice versa). As is common in research work
involving equality, we overload the symbol '$=$' and use it to denote both equality
and syntactic identity. The context makes it clear what is intended.

Let $\{ r, s, t, s_1, \ldots, s_n, t_1, \ldots, t_n \} \subseteq \mathit{HTerms}$. We
assume that any equality theory $T$ over Terms includes the congruence axioms denoted
by the following schemata:

\begin{align}
  & s = s, \tag{2} \\
  & s = t \leftrightarrow t = s, \tag{3} \\
  & r = s \land s = t \rightarrow r = t, \tag{4} \\
  & s_1 = t_1 \land \cdots \land s_n = t_n \rightarrow
    f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n). \tag{5}
\end{align}



In logic programming and most implementations of Prolog it is usual to assume an 
equality theory based on syntactic identity. This consists of the congruence axioms 
together with the identity axioms denoted by the following schemata, where $f$ and
$g$ are distinct function symbols or $n \neq m$:

\begin{align}
  & f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n) \rightarrow
    s_1 = t_1 \land \cdots \land s_n = t_n, \tag{6} \\
  & \neg \bigl( f(s_1, \ldots, s_n) = g(t_1, \ldots, t_m) \bigr). \tag{7}
\end{align}

The axioms characterized by schemata (6) and (7) ensure the equality theory depends
only on the syntax. The equality theory for a non-syntactic domain replaces
these axioms by ones that depend instead on the semantics of the domain and, in
particular, on the interpretation given to functor symbols.

The equality theory of Clark (Clark 1978), denoted $\mathcal{FT}$, on which pure logic
programming is based, usually called the Herbrand equality theory, is given by the
congruence axioms, the identity axioms, and the axiom schema

\[ \forall z \in \mathit{Vars} : \forall t \in (\mathit{HTerms} \setminus \mathit{Vars}) :
   z \in \mathrm{vars}(t) \rightarrow \neg (z = t). \tag{8} \]

Axioms characterized by the schema (8) are called the occurs-check axioms and are
an essential part of the standard unification procedure in SLD-resolution.

An alternative approach used in some implementations of logic programming 
systems, such as Prolog II, SICStus and Oz, does not require the occurs-check 
axioms. This approach is based on the theory of rational trees (Colmerauer 1982;
Colmerauer 1984), denoted $\mathcal{RT}$. It assumes the congruence axioms and the
identity axioms together with a uniqueness axiom for each substitution in rational
solved form. Informally speaking these state that, after assigning a ground rational
tree to each variable which is not in the domain, the substitution uniquely defines a
ground rational tree for each of its domain variables. Note that being in rational
solved form is a very weak property. Indeed, unification algorithms returning a set of
equations in rational solved form are allowed to be much more "lazy" than one would
expect. We refer the interested reader to (Jaffar et al. 1987; Keisu 1994; Maher 1988)
for details on the subject.

In the sequel we use the expression "equality theory" to denote any consistent, 
decidable theory T satisfying the congruence axioms. We also use the expression 
"syntactic equality theory" to denote any equality theory T also satisfying the 
identity axioms. 

We say that a substitution $\sigma \in \mathit{RSubst}$ is satisfiable in an equality
theory $T$ if, when interpreting $\sigma$ as an equation system in rational solved form,

\[ T \vdash \forall \bigl( \mathit{Vars} \setminus \mathrm{dom}(\sigma) \bigr) :
   \exists\, \mathrm{dom}(\sigma) \mathrel{.} \sigma. \]

Let $e \in \wp_f(\mathit{Eqs})$ be a set of equations in an equality theory $T$. A
substitution $\sigma \in \mathit{RSubst}$ is called a solution for $e$ in $T$ if
$\sigma$ is satisfiable in $T$ and $T \vdash \forall (\sigma \rightarrow e)$; we say
that $e$ is satisfiable if it has a solution. If
$\mathrm{vars}(\sigma) \subseteq \mathrm{vars}(e)$, then $\sigma$ is said to be a
relevant solution for $e$. In addition, $\sigma$ is a most general solution for $e$ in
$T$ if $T \vdash \forall (\sigma \leftrightarrow e)$. In this paper, a most general
solution is always a relevant solution of $e$. When the theory $T$ is clear from the
context, the set of all the relevant most general solutions for $e$ in $T$ is denoted
by $\mathrm{mgs}(e)$.

Example 2 

Let $e = \{ g(x) = g(f(y)), f(x) = y, z = g(w) \}$ and

\[ \sigma = \{ x \mapsto f(y), y \mapsto f(x), z \mapsto g(w) \}. \]

Then, for any syntactic equality theory $T$, we have
$T \vdash \forall (\sigma \leftrightarrow e)$. Since $\sigma \in \mathit{RSubst}$,
then $\sigma$ and hence $e$ is satisfiable in $\mathcal{RT}$. Intuitively, whatever
rational tree $t_w$ is assigned to the parameter variable $w$, there exist rational
trees $t_x$, $t_y$ and $t_z$ that, when assigned to the domain variables $x$, $y$ and
$z$, will turn $\sigma$ into a set of trivial identities; namely, let $t_x$ and $t_y$
be both equal to the infinite rational tree $f(f(f(\cdots)))$, which is usually denoted
by $f^\omega$, and let $t_z$ be the rational tree $g(t_w)$. Thus $\sigma$ is a relevant
most general solution for $e$ in $\mathcal{RT}$. In contrast,

\[ \tau = \{ x \mapsto f(y), y \mapsto f(x), z \mapsto g(f(a)) \} \]

is just a relevant solution for $e$ in $\mathcal{RT}$. Also observe that, for any
equality theory $T$,

\[ T \vdash \forall \bigl( \sigma \rightarrow \{ x = f(f(x)) \} \bigr) \]

so that $\sigma$ does not satisfy the occurs-check axioms. Therefore, neither $\sigma$
nor $e$ are satisfiable in the Herbrand equality theory $\mathcal{FT}$. Intuitively,
there is no finite tree $t_x$ such that $t_x = f(f(t_x))$.

We have the following useful result regarding 'rt' and satisfiable substitutions 
that are equivalent with respect to any given syntactic equality theory. 

Proposition 3 

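Let $\sigma, \tau \in \mathit{RSubst}$ be satisfiable in the syntactic equality theory
$T$ and suppose that $T \vdash \forall (\sigma \leftrightarrow \tau)$. Then

\begin{align}
  \mathrm{rt}(y, \sigma) \in \mathit{Vars} &\iff \mathrm{rt}(y, \tau) \in \mathit{Vars}, \tag{9} \\
  \mathrm{rt}(y, \sigma) \in \mathit{GTerms} &\iff \mathrm{rt}(y, \tau) \in \mathit{GTerms}, \tag{10} \\
  \mathrm{rt}(y, \sigma) \in \mathit{LTerms} &\iff \mathrm{rt}(y, \tau) \in \mathit{LTerms}. \tag{11}
\end{align}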

2.4 Galois Connections and Upper Closure Operators 

Given two complete lattices $(C, \leq_C)$ and $(A, \leq_A)$, a Galois connection is a
pair of monotonic functions $\alpha\colon C \to A$ and $\gamma\colon A \to C$ such that

\[ \forall c \in C : c \leq_C \gamma\bigl(\alpha(c)\bigr), \qquad
   \forall a \in A : \alpha\bigl(\gamma(a)\bigr) \leq_A a. \]

The functions $\alpha$ and $\gamma$ are said to be the abstraction and concretization
functions, respectively. A Galois insertion is a Galois connection where the
concretization function $\gamma$ is injective.

An upper closure operator (uco) $\rho\colon C \to C$ on the complete lattice
$(C, \leq_C)$ is a monotonic, idempotent and extensive¹ self-map. The set of all uco's
on $C$, denoted by $\mathrm{uco}(C)$, is itself a complete lattice. For any
$\rho \in \mathrm{uco}(C)$, the set $\rho(C)$, i.e., the image under $\rho$ of the
lattice carrier, is a complete lattice under the same partial order $\leq_C$ defined on
$C$. Given a Galois connection, the function
$\rho \stackrel{\mathrm{def}}{=} \gamma \circ \alpha$ is an element of
$\mathrm{uco}(C)$. The presentation of abstract interpretation in terms of Galois
connections can be rephrased by using uco's. In particular, the partial order
$\sqsubseteq$ defined on $\mathrm{uco}(C)$ formalizes the intuition of an abstract
domain being more precise than another one; moreover, given two elements
$\rho_1, \rho_2 \in \mathrm{uco}(C)$, their reduced product (Cousot and Cousot 1979),
denoted $\rho_1 \sqcap \rho_2$, is their glb on $\mathrm{uco}(C)$.

2.5 The Set-Sharing Domain 

The set-sharing domain of Jacobs and Langen (Jacobs and Langen 1989) encodes
both aliasing and groundness information. Let $\mathit{VI} \subseteq_f \mathit{Vars}$
be a fixed and finite set of variables of interest. An element of the set-sharing
domain (a sharing set) is a set of subsets of VI (the sharing groups). Note that the
empty set is not a sharing group.

Definition 4 

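(The set-sharing lattice.) Let
$\mathit{SG} \stackrel{\mathrm{def}}{=} \wp(\mathit{VI}) \setminus \{\emptyset\}$ be the
set of sharing groups. The set-sharing lattice is defined as
$\mathit{SH} \stackrel{\mathrm{def}}{=} \wp(\mathit{SG})$, ordered by subset inclusion.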

The following operators on SH are needed for the specification of the abstract 
semantics. 

Definition 5 

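(Auxiliary operators on SH.) For each
$\mathit{sh}, \mathit{sh}_1, \mathit{sh}_2 \in \mathit{SH}$ and each
$V \subseteq \mathit{VI}$, we define the following functions:

the star-union function $(\cdot)^\star\colon \mathit{SH} \to \mathit{SH}$ is defined as

\[ \mathit{sh}^\star \stackrel{\mathrm{def}}{=}
   \{\, S \in \mathit{SG} \mid \exists n \geq 1 \mathrel{.}
      \exists S_1, \ldots, S_n \in \mathit{sh} \mathrel{.}
      S = S_1 \cup \cdots \cup S_n \,\}; \]

the extraction of the relevant component of $\mathit{sh}$ with respect to $V$ is
encoded by $\mathrm{rel}\colon \wp(\mathit{VI}) \times \mathit{SH} \to \mathit{SH}$
defined as

\[ \mathrm{rel}(V, \mathit{sh}) \stackrel{\mathrm{def}}{=}
   \{\, S \in \mathit{sh} \mid S \cap V \neq \emptyset \,\}; \]

the irrelevant component of $\mathit{sh}$ with respect to $V$ is thus defined as

\[ \overline{\mathrm{rel}}(V, \mathit{sh}) \stackrel{\mathrm{def}}{=}
   \mathit{sh} \setminus \mathrm{rel}(V, \mathit{sh}); \]

the binary union function
$\mathrm{bin}\colon \mathit{SH} \times \mathit{SH} \to \mathit{SH}$ is defined as

\[ \mathrm{bin}(\mathit{sh}_1, \mathit{sh}_2) \stackrel{\mathrm{def}}{=}
   \{\, S_1 \cup S_2 \mid S_1 \in \mathit{sh}_1, S_2 \in \mathit{sh}_2 \,\}; \]

the self-bin-union operation on SH is defined as

\[ \mathit{sh}^2 \stackrel{\mathrm{def}}{=} \mathrm{bin}(\mathit{sh}, \mathit{sh}); \]

the abstract existential quantification function
$\mathrm{aexists}\colon \mathit{SH} \times \wp(\mathit{VI}) \to \mathit{SH}$ is
defined as

\[ \mathrm{aexists}(\mathit{sh}, V) \stackrel{\mathrm{def}}{=}
   \{\, S \setminus V \mid S \in \mathit{sh}, S \setminus V \neq \emptyset \,\}
   \cup \{\, \{x\} \mid x \in V \,\}. \]

¹ Namely, $c \leq_C \rho(c)$ for each $c \in C$.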
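
The operators of Definition 5 translate almost literally into set comprehensions. The
following Python sketch is an illustration only: it assumes that a sharing group is a
`frozenset` of variable names and a sharing set is a `set` of such groups; the names
`star`, `irrel`, `bin_union`, `self_bin_union` and the parsing helper `groups` are
hypothetical and do not come from the paper.

```python
def star(sh):
    """Star-union: all unions of one or more sharing groups of sh."""
    closed = set(sh)
    while True:
        new = {a | b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new


def rel(V, sh):
    """Relevant component: the groups of sh sharing a variable with V."""
    return {S for S in sh if S & V}


def irrel(V, sh):
    """Irrelevant component: the groups of sh disjoint from V."""
    return sh - rel(V, sh)


def bin_union(sh1, sh2):
    """Binary union: pairwise unions of groups taken from sh1 and sh2."""
    return {a | b for a in sh1 for b in sh2}


def self_bin_union(sh):
    """Self-bin-union: bin(sh, sh)."""
    return bin_union(sh, sh)


def aexists(sh, V):
    """Abstract existential quantification of the variables in V."""
    return {S - V for S in sh if S - V} | {frozenset({x}) for x in V}


def groups(text):
    """Hypothetical helper: 'xy yz' denotes {{x, y}, {y, z}}."""
    return {frozenset(word) for word in text.split()}


sh = groups("xy yz")
print(star(sh) == groups("xy yz xyz"))
print(aexists(sh, {"y"}) == groups("x y z"))
```

Note that `star` computes the closure by repeated pairwise union, which can be
exponential in the number of variables; this is the cost that the redundancy results
mentioned in the Introduction aim to avoid.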

In (Bagnara et al. 1997; Bagnara et al. 2002) it was shown that the domain SH
contains many elements that are redundant for the computation of the actual
observable properties of the analysis, definite groundness and definite independence.
The following formalization of these observables is a rewording of the definitions
provided in (Zaffanella et al. 1999; Zaffanella et al. 2002).

Definition 6 

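(The observables of SH.) The groundness and independence observables (on SH)
$\rho_{\mathit{Con}}, \rho_{\mathit{PS}} \in \mathrm{uco}(\mathit{SH})$ are defined, for
each $\mathit{sh} \in \mathit{SH}$, by

\begin{align*}
  \rho_{\mathit{Con}}(\mathit{sh}) &\stackrel{\mathrm{def}}{=}
    \{\, S \in \mathit{SG} \mid S \subseteq \mathrm{vars}(\mathit{sh}) \,\}, \\
  \rho_{\mathit{PS}}(\mathit{sh}) &\stackrel{\mathrm{def}}{=}
    \bigl\{\, S \in \mathit{SG} \bigm| \forall P \subseteq S :
      (\# P = 2 \implies \exists T \in \mathit{sh} \mathrel{.} P \subseteq T) \,\bigr\}.
\end{align*}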

Note that, as usual in sharing analysis domains, definite groundness and definite in- 
dependence are both represented by encoding possible non-groundness and possible 
pair-sharing information. 

The abstract domain PSD (Bagnara et al. 2002; Zaffanella et al. 2002) is the
simplest abstraction of the domain SH that still preserves the same precision on
groundness and independence.

Definition 7 

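(The pair-sharing dependency lattice PSD.) The operator
$\rho_{\mathit{PSD}} \in \mathrm{uco}(\mathit{SH})$ is defined, for each
$\mathit{sh} \in \mathit{SH}$, by

\[ \rho_{\mathit{PSD}}(\mathit{sh}) \stackrel{\mathrm{def}}{=}
   \Bigl\{\, S \in \mathit{SG} \Bigm| \forall y \in S :
     S = \bigcup \{\, U \in \mathit{sh} \mid y \in U \subseteq S \,\} \,\Bigr\}. \]

The pair-sharing dependency lattice is
$\mathit{PSD} \stackrel{\mathrm{def}}{=} \rho_{\mathit{PSD}}(\mathit{SH})$.

In the following example we provide an intuitive interpretation of the
approximation induced by the three upper closure operators of Definitions 6 and 7.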

Example 8 

Let $\mathit{VI} = \{v, w, x, y, z\}$ and consider² $\mathit{sh} = \{vx, vy, xy, xyz\}$.
Then

\begin{align*}
  \rho_{\mathit{Con}}(\mathit{sh}) &=
    \{v, vx, vxy, vxyz, vxz, vy, vyz, vz, x, xy, xyz, xz, y, yz, z\}, \\
  \rho_{\mathit{PS}}(\mathit{sh}) &=
    \{v, vx, vxy, vy, w, x, xy, xyz, xz, y, yz, z\}, \\
  \rho_{\mathit{PSD}}(\mathit{sh}) &= \{vx, vxy, vy, xy, xyz\}.
\end{align*}

² In this and all the following examples, we will adopt a simplified notation for a
set-sharing element $\mathit{sh}$, omitting inner braces. For instance, we will write
$\{xy, xz, yz\}$ to denote $\{\{x,y\},\{x,z\},\{y,z\}\}$.

When observing $\rho_{\mathit{Con}}(\mathit{sh})$, the only information available is
that variable $w$ does not occur in a sharing group; intuitively, this means that $w$
is definitely ground. All the other information encoded in $\mathit{sh}$ is lost; for
instance, in $\mathit{sh}$ variables $v$ and $z$ never occur in the same sharing group
(i.e., they are definitely independent), while this happens in
$\rho_{\mathit{Con}}(\mathit{sh})$.

When observing $\rho_{\mathit{PS}}(\mathit{sh})$, it should be noted that two distinct
variables occur in the same sharing group if and only if they were also occurring
together in a sharing group of $\mathit{sh}$, so that the definite independence
information is preserved (e.g., $v$ and $z$ keep their independence). On the other
hand, all the variables in VI occur as singletons in $\rho_{\mathit{PS}}(\mathit{sh})$
whether or not they are known to be ground; for instance, $\{w\}$ occurs in
$\rho_{\mathit{PS}}(\mathit{sh})$ although $w$ does not occur in any sharing group in
$\mathit{sh}$.

By noting that $\rho_{\mathit{PSD}}(\mathit{sh}) \subset
\rho_{\mathit{Con}}(\mathit{sh}) \cap \rho_{\mathit{PS}}(\mathit{sh})$, it follows that
$\rho_{\mathit{PSD}}(\mathit{sh})$ preserves both the definite groundness and the
definite independence information of $\mathit{sh}$; moreover, as the inclusion is
strict, $\rho_{\mathit{PSD}}(\mathit{sh})$ encodes other information, such as variable
covering (the interested reader is referred to (Bagnara et al. 2002;
Zaffanella et al. 2002) for a more formal discussion).
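
The three closure operators of Definitions 6 and 7 can be evaluated by brute-force
enumeration of the sharing groups over VI. The Python sketch below is meant only as a
way to reproduce Example 8 on small inputs; it assumes sharing groups are `frozenset`s
of one-letter variable names, and the function names (`sharing_groups`, `rho_con`,
`rho_ps`, `rho_psd`, `groups`) are hypothetical.

```python
from itertools import combinations


def sharing_groups(VI):
    """All non-empty subsets of VI, i.e. the set SG."""
    vs = sorted(VI)
    return [frozenset(c) for r in range(1, len(vs) + 1) for c in combinations(vs, r)]


def rho_con(sh, VI):
    """Groundness observable: every group built from variables occurring in sh."""
    support = set().union(*sh)
    return {S for S in sharing_groups(VI) if S <= support}


def rho_ps(sh, VI):
    """Pair-sharing observable: every pair inside S must share in some group of sh."""
    return {S for S in sharing_groups(VI)
            if all(any(set(P) <= T for T in sh) for P in combinations(S, 2))}


def rho_psd(sh, VI):
    """PSD closure: each y in S must rebuild S from the groups of sh within S."""
    def rebuilt_from(S, y):
        parts = [U for U in sh if y in U and U <= S]
        return frozenset().union(*parts) == S
    return {S for S in sharing_groups(VI) if all(rebuilt_from(S, y) for y in S)}


def groups(text):
    """Hypothetical helper: 'vx vy' denotes {{v, x}, {v, y}}."""
    return {frozenset(word) for word in text.split()}


VI = set("vwxyz")
sh = groups("vx vy xy xyz")
print(sorted("".join(sorted(S)) for S in rho_psd(sh, VI)))
```

The enumeration is exponential in the size of VI, so the sketch is suitable for
checking small examples by hand rather than for use inside an analyzer.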

2.6 Variable-Idempotent Substitutions

One of the key concepts used in (Hill et al. 2003) for the proofs of the correctness
results stated in this paper is that of variable-idempotence. For the interested
reader, we provide here a brief introduction to variable-idempotent substitutions,
although these are not referred to elsewhere in the paper.

The definition of idempotence requires that repeated applications of a substitu- 
tion do not change the syntactic structure of a term and idempotent substitutions 
are normally the preferred form of a solution to a set of equations. However, in 
the domain of rational trees, a set of solvable equations does not necessarily have 
an idempotent solution (for instance, in Example 2 the set of equations e has no
idempotent solution). On the other hand, several abstractions of terms, such as the
ones commonly used for sharing analysis, are only interested in the set of variables 
occurring in a term and not in the concrete structure that contains them. Thus, 
for applications such as sharing analysis, a useful way to relax the definition of 
idempotence is to ignore the structure of terms and just require that the repeated 
application of a substitution leaves the set of variables in a term invariant. 

Definition 9 

(Variable-idempotence.) A substitution $\sigma \in \mathit{RSubst}$ is
variable-idempotent³ if and only if for all $t \in \mathit{HTerms}$ we have

\[ \mathrm{vars}(t\sigma\sigma) = \mathrm{vars}(t\sigma). \]

The set of variable-idempotent substitutions is denoted VSubst.

³ This definition, which is the same as that originally provided in (Hill et al. 1998),
is slightly stronger than the one adopted in (Hill et al. 2002), which disregarded the
domain variables of the substitution. The adoption of this stronger definition allows
for some simplifications in the correctness proofs for freeness and linearity.

As any idempotent substitution is also variable-idempotent, we have
$\mathit{ISubst} \subset \mathit{VSubst} \subset \mathit{RSubst}$.

Example 10 

Consider the following substitutions which are all in RSubst. 

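\begin{align*}
  \sigma_1 &= \{ x \mapsto f(y) \} \in \mathit{ISubst} \subset \mathit{VSubst}, \\
  \sigma_2 &= \{ x \mapsto f(x) \} \in \mathit{VSubst} \setminus \mathit{ISubst}, \\
  \sigma_3 &= \{ x \mapsto f(y, z), y \mapsto f(z, y) \} \in \mathit{VSubst} \setminus \mathit{ISubst}, \\
  \sigma_4 &= \{ x \mapsto y, y \mapsto f(x, y) \} \notin \mathit{VSubst}.
\end{align*}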
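
Variable-idempotence quantifies over all finite terms, but since the variables of
$t\sigma$ are the union of the variables of $x\sigma$ for $x$ occurring in $t$, it is
enough to test the condition on the domain variables of the substitution. The Python
sketch below relies on this observation, which is ours and is stated here only to
justify the code. It assumes variables are strings, other terms are tuples headed by
the function symbol, and substitutions are dictionaries; all names are hypothetical.

```python
def term_vars(t):
    """Set of variables of a term (str = variable, tuple = function application)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))


def apply(sigma, t):
    """Apply the substitution sigma once to the term t."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, arg) for arg in t[1:])


def is_variable_idempotent(sigma):
    """Check vars(x sigma sigma) == vars(x sigma) for every domain variable x."""
    return all(term_vars(apply(sigma, t)) == term_vars(t) for t in sigma.values())


# The four substitutions of Example 10.
s1 = {"x": ("f", "y")}
s2 = {"x": ("f", "x")}
s3 = {"x": ("f", "y", "z"), "y": ("f", "z", "y")}
s4 = {"x": "y", "y": ("f", "x", "y")}

for name, s in [("s1", s1), ("s2", s2), ("s3", s3), ("s4", s4)]:
    print(name, "variable-idempotent:", is_variable_idempotent(s))
```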

3 The Domain SFL 

The abstract domain SFL is made up of three components, providing different kinds 
of sharing information regarding the set of variables of interest VI: the first
component is the set-sharing domain SH of Jacobs and Langen (Jacobs and Langen 1989);
the other two components provide freeness and linearity information, each repre- 
sented by simply recording those variables of interest that are known to enjoy the 
corresponding property. 

Definition 11 

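(The domain SFL.) Let $F \stackrel{\mathrm{def}}{=} \wp(\mathit{VI})$ and
$L \stackrel{\mathrm{def}}{=} \wp(\mathit{VI})$ be partially ordered by reverse subset
inclusion. The abstract domain SFL is defined as

\[ \mathit{SFL} \stackrel{\mathrm{def}}{=}
   \{\, \langle \mathit{sh}, f, l \rangle \mid
      \mathit{sh} \in \mathit{SH}, f \in F, l \in L \,\} \]

and is ordered by $\leq_S$, the component-wise extension of the orderings defined on
the sub-domains. With this ordering, SFL is a complete lattice whose least upper bound
operation is denoted by $\mathrm{alub}_S$. The bottom element
$\langle \emptyset, \mathit{VI}, \mathit{VI} \rangle$ will be denoted by $\bot_S$.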

3.1 The Abstraction Function 

When the concrete domain is based on the theory of finite trees, idempotent
substitutions provide a finitely computable strong normal form for domain elements,
meaning that different substitutions describe different sets of finite trees.⁴ In
contrast, when working on a concrete domain based on the theory of rational trees,
substitutions in rational solved form, while being finitely computable, no longer
satisfy this property: there can be an infinite set of substitutions in rational solved
form all describing the same set of rational trees (i.e., the same element in the
"intended" semantics). For instance, the substitutions

\[ \sigma_n = \{ x \mapsto \underbrace{f(\cdots f(}_{n} x ) \cdots ) \}, \]

for $n = 1, 2, \ldots$, all map the variable $x$ into the same infinite rational tree
$f^\omega$.

⁴ As usual, this is modulo the possible renaming of variables.

Ideally, a strong normal form for the set of rational trees described by a
substitution $\sigma \in \mathit{RSubst}$ can be obtained by computing the limit
$\sigma^\infty$. The problem is that $\sigma^\infty$ can map domain variables to
infinite rational terms and may not be in RSubst.

This poses a non-trivial problem when trying to define "good" abstraction functions,
since it would be really desirable for this function to map any two equivalent
concrete elements to the same abstract element. As shown in (Hill et al. 2002),
the classical abstraction function for set-sharing analysis (Cortesi and Filé 1999;
Jacobs and Langen 1989), which was defined only for substitutions that are idempotent,
does not enjoy this property when applied, as it is, to arbitrary substitutions
in rational solved form. In (Hill et al. 1998; Hill et al. 2002), this problem is
solved by replacing the sharing group operator 'sg' of (Jacobs and Langen 1989) by an
occurrence operator, 'occ', defined by means of a fixpoint computation. However,
to simplify the presentation, here we define 'occ' directly by exploiting the fact that
the number of iterations needed to reach the fixpoint is bounded by the number of
bindings in the substitution.

Definition 12 

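(Occurrence operator.) For each $\sigma \in \mathit{RSubst}$ and
$v \in \mathit{Vars}$, the occurrence operator
$\mathrm{occ}\colon \mathit{RSubst} \times \mathit{Vars} \to \wp_f(\mathit{Vars})$ is
defined as

\[ \mathrm{occ}(\sigma, v) \stackrel{\mathrm{def}}{=}
   \{\, y \in \mathit{Vars} \mid n = \# \sigma,
      v \in \mathrm{vars}(y\sigma^n) \setminus \mathrm{dom}(\sigma) \,\}. \]

For each $\sigma \in \mathit{RSubst}$, the operator
$\mathrm{ssets}\colon \mathit{RSubst} \to \mathit{SH}$ is defined as

\[ \mathrm{ssets}(\sigma) \stackrel{\mathrm{def}}{=}
   \{\, \mathrm{occ}(\sigma, v) \cap \mathit{VI} \mid v \in \mathit{Vars} \,\}
   \setminus \{\emptyset\}. \]

The operator 'ssets' is introduced for notational convenience only.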

Example 13 
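Let

\begin{align*}
  \sigma &= \{ x_1 \mapsto f(x_2), x_2 \mapsto g(x_3, x_4), x_3 \mapsto x_1 \}, \\
  \tau &= \{ x_1 \mapsto f(g(x_3, x_4)), x_2 \mapsto g(x_3, x_4), x_3 \mapsto f(g(x_3, x_4)) \}.
\end{align*}

Then $\mathrm{dom}(\sigma) = \mathrm{dom}(\tau) = \{x_1, x_2, x_3\}$ so that
$\mathrm{occ}(\sigma, x_i) = \mathrm{occ}(\tau, x_i) = \emptyset$, for $i = 1, 2, 3$,
and $\mathrm{occ}(\sigma, x_4) = \mathrm{occ}(\tau, x_4) = \{x_1, x_2, x_3, x_4\}$. As a
consequence, supposing that $\mathit{VI} = \{x_1, x_2, x_3, x_4\}$, we obtain
$\mathrm{ssets}(\sigma) = \mathrm{ssets}(\tau) = \{\mathit{VI}\}$.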
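
Definition 12 can be executed directly on small substitutions. The Python sketch below
is an illustration under stated assumptions, not the implementation used in the paper:
variables are strings, other terms are tuples headed by the function symbol, and a
substitution is a dictionary. Since Vars is infinite, `occ` uses the fact that a
variable outside the domain is mapped to itself, and `ssets` only ranges over the
variables in VI or occurring in the substitution. All function names are hypothetical.

```python
def term_vars(t):
    """Set of variables of a term (str = variable, tuple = function application)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))


def apply(sigma, t):
    """Apply the substitution sigma once to the term t."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, arg) for arg in t[1:])


def power(sigma, t, n):
    """Apply sigma n times to t, i.e. compute t sigma^n."""
    for _ in range(n):
        t = apply(sigma, t)
    return t


def occ(sigma, v):
    """Variables y such that v occurs in y sigma^n and v is not in dom(sigma)."""
    if v in sigma:
        return set()
    n = len(sigma)
    return {v} | {y for y in sigma if v in term_vars(power(sigma, y, n))}


def ssets(sigma, VI):
    """Sharing set described by sigma over the variables of interest VI."""
    in_bindings = set(sigma).union(*(term_vars(t) for t in sigma.values()))
    candidates = set(VI) | in_bindings
    return {frozenset(occ(sigma, v) & VI) for v in candidates} - {frozenset()}


# The two substitutions of Example 13.
g34 = ("g", "x3", "x4")
sigma = {"x1": ("f", "x2"), "x2": g34, "x3": "x1"}
tau = {"x1": ("f", g34), "x2": g34, "x3": ("f", g34)}
VI = {"x1", "x2", "x3", "x4"}
print(ssets(sigma, VI) == ssets(tau, VI) == {frozenset(VI)})
```

Terms may grow quickly under repeated application, so this direct unfolding is only
practical for small examples.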

In a similar way, it is possible to define suitable operators for groundness, freeness
and linearity. As all ground trees are linear, a knowledge of the definite groundness
information can be useful for proving properties concerning the linearity abstraction.
Groundness is already encoded in the abstraction for set-sharing provided in
Definition 12; nonetheless, for both a simplified notation and a clearer intuitive
reading, we now explicitly define the set of variables that are associated to ground
trees by a substitution in RSubst.

Definition 14

(Groundness operator.) The groundness operator
$\mathrm{gvars}\colon \mathit{RSubst} \to \wp_f(\mathit{Vars})$ is defined, for each
$\sigma \in \mathit{RSubst}$, by

\[ \mathrm{gvars}(\sigma) \stackrel{\mathrm{def}}{=}
   \{\, y \in \mathrm{dom}(\sigma) \mid
      \forall v \in \mathit{Vars} : y \notin \mathrm{occ}(\sigma, v) \,\}. \]

Example 15 

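Consider $\sigma \in \mathit{RSubst}$ where

\[ \sigma = \{ x_1 \mapsto x_2, x_2 \mapsto f(a), x_3 \mapsto x_4, x_4 \mapsto f(x_2, x_4) \}. \]

Then $\mathrm{gvars}(\sigma) = \{x_1, x_2, x_3, x_4\}$. Observe that
$x_1 \in \mathrm{gvars}(\sigma)$ although $x_1\sigma \in \mathit{Vars}$. Also,
$x_3 \in \mathrm{gvars}(\sigma)$ although
$\mathrm{vars}(x_3\sigma^i) = \{x_2, x_4\} \neq \emptyset$ for all $i \geq 2$.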
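
Unfolding the occurrence operator in Definition 14, a domain variable $y$ is in
$\mathrm{gvars}(\sigma)$ exactly when every variable of $y\sigma^n$, with
$n = \#\sigma$, belongs to $\mathrm{dom}(\sigma)$. The Python sketch below uses this
reading; the term encoding (strings for variables, tuples for other terms) and the
function names are assumptions made for illustration, and the example substitution is
the one of Example 15 read with the binding $x_3 \mapsto x_4$.

```python
def term_vars(t):
    """Set of variables of a term (str = variable, tuple = function application)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))


def apply(sigma, t):
    """Apply the substitution sigma once to the term t."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, arg) for arg in t[1:])


def power(sigma, t, n):
    """Apply sigma n times to t, i.e. compute t sigma^n."""
    for _ in range(n):
        t = apply(sigma, t)
    return t


def gvars(sigma):
    """Domain variables whose image under sigma^n mentions domain variables only."""
    n, dom = len(sigma), set(sigma)
    return {y for y in sigma if term_vars(power(sigma, y, n)) <= dom}


# Example 15; the constant a is encoded as ("a",).
sigma = {
    "x1": "x2",
    "x2": ("f", ("a",)),
    "x3": "x4",
    "x4": ("f", "x2", "x4"),
}
print(sorted(gvars(sigma)))
```

In the example, `x4` is reported as ground even though its image always mentions `x2`
and `x4`: both are domain variables, so the rational tree they denote contains no
variable.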

As for possible sharing, the definite freeness information can be extracted from 
a substitution in rational solved form by observing the result of a bounded number 
of applications of the substitution. 

Definition 16 

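(Freeness operator.) The freeness operator
$\mathrm{fvars}\colon \mathit{RSubst} \to \wp(\mathit{Vars})$ is defined, for each
$\sigma \in \mathit{RSubst}$, by

\[ \mathrm{fvars}(\sigma) \stackrel{\mathrm{def}}{=}
   \{\, y \in \mathit{Vars} \mid n = \# \sigma, y\sigma^n \in \mathit{Vars} \,\}. \]

As $\sigma \in \mathit{RSubst}$ has no circular subset, $y \in \mathrm{fvars}(\sigma)$
implies $y\sigma^n \in \mathit{Vars} \setminus \mathrm{dom}(\sigma)$.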

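Example 17

Let $\mathit{VI} = \{x_1, x_2, x_3, x_4, x_5\}$ and consider $\sigma \in \mathit{RSubst}$ where

\[ \sigma = \{ x_1 \mapsto x_2,\ x_2 \mapsto f(x_3),\ x_3 \mapsto x_4,\ x_4 \mapsto x_5 \}. \]

Then $\mathrm{fvars}(\sigma) \cap \mathit{VI} = \{x_3, x_4, x_5\}$. Thus $x_1 \notin \mathrm{fvars}(\sigma)$ although $x_1\sigma \in \mathit{Vars}$. Also,
$x_3 \in \mathrm{fvars}(\sigma)$ although $x_3\sigma \in \mathrm{dom}(\sigma)$.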

As in previous cases, the definite linearity information can be extracted by observing
the result of a bounded number of applications of the considered substitution.

Definition 18

(Linearity operator.) The linearity operator $\mathrm{lvars}\colon \mathit{RSubst} \to \wp(\mathit{Vars})$ is defined,
for each $\sigma \in \mathit{RSubst}$, by

\[ \mathrm{lvars}(\sigma) \stackrel{\mathrm{def}}{=} \{\, y \in \mathit{Vars} \mid n = \#\sigma,\ \forall z \in \mathrm{vars}(y\sigma^n) \setminus \mathrm{dom}(\sigma) : \mathrm{occ\_lin}(z, y\sigma^{2n}) \,\}. \]

In the next example we consider the extraction of linearity from two substitutions.
The substitution $\sigma$ shows that, in contrast with the case of set-sharing and freeness,
for linearity we may need to compute up to $2n$ applications, where $n = \#\sigma$; the
substitution $\tau$ shows that, when observing the term $y\tau^{2n}$, multiple occurrences of
domain variables have to be disregarded.

Example 19

Let $\mathit{VI} = \{x_1, x_2, x_3, x_4\}$ and consider $\sigma \in \mathit{RSubst}$ where

\[ \sigma = \{ x_1 \mapsto x_2,\ x_2 \mapsto x_3,\ x_3 \mapsto f(x_1, x_4) \}. \]

Then $\mathrm{lvars}(\sigma) \cap \mathit{VI} = \{x_4\}$. Observe that $x_1 \notin \mathrm{lvars}(\sigma)$. This is because $x_4 \notin \mathrm{dom}(\sigma)$,
$x_1\sigma^3 = f(x_1, x_4)$ so that $x_4 \in \mathrm{vars}(x_1\sigma^3)$, and $x_1\sigma^6 = f(f(x_1, x_4), x_4)$
so that $\mathrm{occ\_lin}(x_4, x_1\sigma^6)$ does not hold. Note also that $\mathrm{occ\_lin}(x_4, x_1\sigma^i)$ holds for
$i = 3, 4, 5$.

Consider now $\tau \in \mathit{RSubst}$ where

\[ \tau = \{ x_1 \mapsto f(x_2, x_2),\ x_2 \mapsto f(x_2) \}. \]

Then $\mathrm{lvars}(\tau) \cap \mathit{VI} = \mathit{VI}$. Note that we have $x_1 \in \mathrm{lvars}(\tau)$ although, for all $i > 0$,
$x_2 \in \mathrm{dom}(\tau)$ occurs more than once in the term $x_1\tau^i$.
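Definitions 14, 16 and 18 all follow the same pattern: apply the substitution a bounded number of times and inspect the resulting term. The sketch below illustrates this on Examples 15, 17 and 19. It is only an illustration under assumptions: the term encoding (variables as strings, compound terms as tuples headed by the functor) and the function names are hypothetical, and $\mathrm{occ\_lin}(z, t)$ is read as "z occurs exactly once in t".

```python
from collections import Counter

def apply(term, subst):
    """Apply the substitution once; variables are str, compound terms are tuples."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(arg, subst) for arg in term[1:])

def power(term, subst, n):
    """Compute term * subst^n by n successive applications."""
    for _ in range(n):
        term = apply(term, subst)
    return term

def occurrences(term):
    """Multiset of variable occurrences in the term."""
    if isinstance(term, str):
        return Counter([term])
    total = Counter()
    for arg in term[1:]:
        total += occurrences(arg)
    return total

def gvars(subst):
    """Domain variables whose n-th iterate mentions domain variables only."""
    n = len(subst)
    return {y for y in subst
            if set(occurrences(power(y, subst, n))) <= set(subst)}

def fvars(subst, vi):
    """Variables of interest whose n-th iterate is still a variable."""
    n = len(subst)
    return {y for y in vi if isinstance(power(y, subst, n), str)}

def lvars(subst, vi):
    """Variables of interest whose surviving variables occur once after 2n steps."""
    n = len(subst)
    linear = set()
    for y in vi:
        surviving = set(occurrences(power(y, subst, n))) - set(subst)
        counts = occurrences(power(y, subst, 2 * n))
        if all(counts[z] == 1 for z in surviving):
            linear.add(y)
    return linear

# Example 15 (groundness), Example 17 (freeness), Example 19 (linearity).
sigma15 = {"x1": "x2", "x2": ("f", ("a",)), "x3": "x4", "x4": ("f", "x2", "x4")}
sigma17 = {"x1": "x2", "x2": ("f", "x3"), "x3": "x4", "x4": "x5"}
sigma19 = {"x1": "x2", "x2": "x3", "x3": ("f", "x1", "x4")}
tau19 = {"x1": ("f", "x2", "x2"), "x2": ("f", "x2")}
VI4 = {"x1", "x2", "x3", "x4"}
VI5 = VI4 | {"x5"}
```

Note how `lvars` counts occurrences in the iterate of order 2n but only for the variables that survive, outside the domain, in the iterate of order n; this is what lets the substitution `tau19` be classified as linear on every variable of interest.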

The occurrence, groundness, freeness and linearity operators are invariant with 
respect to substitutions that are equivalent in the given syntactic equality theory. 

Proposition 20 

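Let $\sigma, \tau \in \mathit{RSubst}$ be satisfiable in the syntactic equality theory $T$ and suppose that
$T \vdash \forall(\sigma \leftrightarrow \tau)$. Then

\[ \mathrm{ssets}(\sigma) = \mathrm{ssets}(\tau), \qquad (12) \]

\[ \mathrm{gvars}(\sigma) = \mathrm{gvars}(\tau), \qquad (13) \]

\[ \mathrm{fvars}(\sigma) = \mathrm{fvars}(\tau), \qquad (14) \]

\[ \mathrm{lvars}(\sigma) = \mathrm{lvars}(\tau). \qquad (15) \]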

Moreover, these operators precisely capture the intended properties over the domain
of rational trees.

Proposition 21

If $\sigma \in \mathit{RSubst}$ and $y, v \in \mathit{Vars}$ then

\[ y \in \mathrm{occ}(\sigma, v) \iff v \in \mathrm{vars}(\mathrm{rt}(y, \sigma)), \qquad (16) \]

\[ y \in \mathrm{gvars}(\sigma) \iff \mathrm{rt}(y, \sigma) \in \mathit{GTerms}, \qquad (17) \]

\[ y \in \mathrm{fvars}(\sigma) \iff \mathrm{rt}(y, \sigma) \in \mathit{Vars}, \qquad (18) \]

\[ y \in \mathrm{lvars}(\sigma) \iff \mathrm{rt}(y, \sigma) \in \mathit{LTerms}. \qquad (19) \]

It follows from (16) and (18) that any free variable necessarily shares (at least, with
itself). Also, as $\mathit{Vars} \cup \mathit{GTerms} \subseteq \mathit{LTerms}$, it follows from (17), (18) and (19) that
any variable that is either ground or free is also necessarily linear. Thus we have
the following corollary.

Corollary 22 

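If $\sigma \in \mathit{RSubst}$, then

\[ \mathrm{fvars}(\sigma) \subseteq \mathrm{vars}(\mathrm{ssets}(\sigma)), \]
\[ \mathrm{fvars}(\sigma) \cup \mathrm{gvars}(\sigma) \subseteq \mathrm{lvars}(\sigma). \]

We are now in a position to define the abstraction function mapping rational trees
to elements of the domain $\mathit{SFL}$.

Definition 23

(The abstraction function for SFL.) For each substitution $\sigma \in \mathit{RSubst}$, the
function $\alpha_S\colon \mathit{RSubst} \to \mathit{SFL}$ is defined by

\[ \alpha_S(\sigma) \stackrel{\mathrm{def}}{=} \langle \mathrm{ssets}(\sigma),\ \mathrm{fvars}(\sigma) \cap \mathit{VI},\ \mathrm{lvars}(\sigma) \cap \mathit{VI} \rangle. \]

The concrete domain $\wp(\mathit{RSubst})$ is related to $\mathit{SFL}$ by means of the abstraction
function $\alpha_S\colon \wp(\mathit{RSubst}) \to \mathit{SFL}$ such that, for each $\Sigma \in \wp(\mathit{RSubst})$,

\[ \alpha_S(\Sigma) \stackrel{\mathrm{def}}{=} \mathrm{alub}_S \{\, \alpha_S(\sigma) \mid \sigma \in \Sigma \,\}. \]

Since the abstraction function $\alpha_S$ is additive, the concretization function is given
by the adjoint (Cousot and Cousot 1977)

\[ \gamma_S(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \{\, \sigma \in \mathit{RSubst} \mid \mathrm{ssets}(\sigma) \subseteq \mathit{sh},\ \mathrm{fvars}(\sigma) \supseteq f,\ \mathrm{lvars}(\sigma) \supseteq l \,\}. \]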

With Definition 23 and Proposition 20 one of our objectives is fulfilled: substitutions
in $\mathit{RSubst}$ that are equivalent have the same abstraction.

Corollary 24

Let $\sigma, \tau \in \mathit{RSubst}$ be satisfiable in the syntactic equality theory $T$ and suppose
$T \vdash \forall(\sigma \leftrightarrow \tau)$. Then $\alpha_S(\sigma) = \alpha_S(\tau)$.

Observe that the Galois connection defined by the functions $\alpha_S$ and $\gamma_S$ is not a
Galois insertion since different abstract elements are mapped by $\gamma_S$ to the same
set of concrete computation states. To see this it is sufficient to observe that, by
Corollary 22, any abstract element $d = \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$ such that $f \nsubseteq \mathrm{vars}(\mathit{sh})$, as
is the case for the bottom element $\bot_S$, satisfies $\gamma_S(d) = \gamma_S(\bot_S) = \varnothing$; thus, all such
$d$'s will represent the semantics of those program fragments that have no successful
computations. Similarly, by letting $V = (\mathit{VI} \setminus \mathrm{vars}(\mathit{sh})) \cup f$, it can be seen that, for
any $l'$ such that $V \cup l = V \cup l'$, we have, again by Corollary 22, $\gamma_S(d) = \gamma_S(\langle \mathit{sh}, f, l' \rangle)$.

Of course, by taking the abstract domain as the subset of $\mathit{SFL}$ that is the
co-domain of $\alpha_S$, we would have a Galois insertion. However, apart from the simple
cases shown above, it is somewhat difficult to explicitly characterize such a set. For
instance, as observed in (Filé 1994), if

\[ d = \langle \{xy, xz, yz\},\ \{x, y, z\},\ \{x, y, z\} \rangle \in \mathit{SFL} \]

we have $\gamma_S(d) = \gamma_S(\bot_S) = \varnothing$. It is worth stressing that these "spurious" elements
do not compromise the correctness of the analysis and, although they can affect
the precision of the analysis, they rarely occur in practice (Bagnara et al. 2000;
Zaffanella 2001).

3.2 The Abstract Operators 

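The specification of the abstract unification operator on the domain $\mathit{SFL}$ is rather
complex, since it is based on a very detailed case analysis. To achieve some modularity,
which will also be useful when proving its correctness, in the next definition
we introduce several auxiliary abstract operators.

Definition 25

(Auxiliary operators in SFL.) Let $s, t \in \mathit{HTerms}$ be finite terms such that
$\mathrm{vars}(s) \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. For each $d = \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$ we define the following
predicates:

$s$ and $t$ are independent in $d$ if and only if $\mathrm{ind}_d\colon \mathit{HTerms}^2 \to \mathit{Bool}$ holds for $(s, t)$,
where

\[ \mathrm{ind}_d(s, t) \stackrel{\mathrm{def}}{=} \bigl( \mathrm{rel}(\mathrm{vars}(s), \mathit{sh}) \cap \mathrm{rel}(\mathrm{vars}(t), \mathit{sh}) = \varnothing \bigr); \]

$t$ is ground in $d$ if and only if $\mathrm{ground}_d\colon \mathit{HTerms} \to \mathit{Bool}$ holds for $t$, where

\[ \mathrm{ground}_d(t) \stackrel{\mathrm{def}}{=} \bigl( \mathrm{vars}(t) \subseteq \mathit{VI} \setminus \mathrm{vars}(\mathit{sh}) \bigr); \]

$y \in \mathrm{vars}(t)$ occurs linearly (in $t$) in $d$ if and only if $\mathrm{occ\_lin}_d\colon \mathit{VI} \times \mathit{HTerms} \to \mathit{Bool}$
holds for $(y, t)$, where

\[ \mathrm{occ\_lin}_d(y, t) \stackrel{\mathrm{def}}{=} \mathrm{ground}_d(y) \lor \Bigl( \mathrm{occ\_lin}(y, t) \land (y \in l) \land \forall z \in \mathrm{vars}(t) : \bigl( y \neq z \implies \mathrm{ind}_d(y, z) \bigr) \Bigr); \]

$t$ is free in $d$ if and only if $\mathrm{free}_d\colon \mathit{HTerms} \to \mathit{Bool}$ holds for $t$, where

\[ \mathrm{free}_d(t) \stackrel{\mathrm{def}}{=} (t \in f); \]

$t$ is linear in $d$ if and only if $\mathrm{lin}_d\colon \mathit{HTerms} \to \mathit{Bool}$ holds for $t$, where

\[ \mathrm{lin}_d(t) \stackrel{\mathrm{def}}{=} \forall y \in \mathrm{vars}(t) : \mathrm{occ\_lin}_d(y, t). \]

The function $\mathrm{share\_with}_d\colon \mathit{HTerms} \to \wp(\mathit{VI})$ yields the set of variables of interest
that may share with the given term. For each $t \in \mathit{HTerms}$,

\[ \mathrm{share\_with}_d(t) \stackrel{\mathrm{def}}{=} \mathrm{vars}\bigl( \mathrm{rel}(\mathrm{vars}(t), \mathit{sh}) \bigr). \]

The function $\mathrm{cyclic}_x^t\colon \mathit{SH} \to \mathit{SH}$ strengthens the sharing set $\mathit{sh}$ by forcing the
coupling of $x$ with $t$. For each $\mathit{sh} \in \mathit{SH}$ and each $(x \mapsto t) \in \mathit{Bind}$,

\[ \mathrm{cyclic}_x^t(\mathit{sh}) \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(\{x\} \cup \mathrm{vars}(t), \mathit{sh}) \cup \mathrm{rel}(\mathrm{vars}(t) \setminus \{x\}, \mathit{sh}). \]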

As a first correctness result, we have that the auxiliary operators correctly ap- 
proximate the corresponding concrete properties. 

Theorem 26 

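Let $d \in \mathit{SFL}$, $\sigma \in \gamma_S(d)$ and $y \in \mathit{VI}$. Let also $s, t \in \mathit{HTerms}$ be two finite terms
such that $\mathrm{vars}(s) \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then

\[ \mathrm{ind}_d(s, t) \implies \mathrm{vars}(\mathrm{rt}(s, \sigma)) \cap \mathrm{vars}(\mathrm{rt}(t, \sigma)) = \varnothing; \qquad (20) \]

\[ \mathrm{ind}_d(y, t) \iff y \notin \mathrm{share\_with}_d(t); \qquad (21) \]

\[ \mathrm{free}_d(t) \implies \mathrm{rt}(t, \sigma) \in \mathit{Vars}; \qquad (22) \]

\[ \mathrm{ground}_d(t) \implies \mathrm{rt}(t, \sigma) \in \mathit{GTerms}; \qquad (23) \]

\[ \mathrm{lin}_d(t) \implies \mathrm{rt}(t, \sigma) \in \mathit{LTerms}. \qquad (24) \]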



Correct and Efficient Integration of Set-Sharing, Freeness and Linearity 17 



Example 27 

Let $\mathit{VI} = \{v, w, x, y, z\}$ and consider the abstract element $d = \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$,
where

\[ \mathit{sh} = \{v, wz, xz, z\}, \qquad f = \{v\}, \qquad l = \{v, x, y, z\}. \]

Then, by applying Definition 25, we obtain the following.

• $\mathrm{ground}_d(x)$ does not hold whereas $\mathrm{ground}_d(y)$ holds.

• $\mathrm{free}_d(v)$ holds but, since $f = \{v\}$, $\mathrm{free}_d$ does not hold for any other variable of interest.

• Both $\mathrm{ind}_d(w, x)$ and $\mathrm{ind}_d(f(w, y), f(x, y))$ hold whereas $\mathrm{ind}_d(x, z)$ does not
hold; note that, in the second case, the two arguments of the predicate do
share $y$, but this does not affect the independence of the corresponding terms,
because $y$ is definitely ground in the abstract element $d$.

• Let $t = f(w, x, x, y, y, z)$; then $\mathrm{occ\_lin}_d(w, t)$ does not hold because $w \notin l$;
$\mathrm{occ\_lin}_d(x, t)$ does not hold because $x$ occurs more than once in $t$; $\mathrm{occ\_lin}_d(y, t)$
holds, even though $y$ occurs twice in $t$, because $y$ is definitely ground in $d$;
$\mathrm{occ\_lin}_d(z, t)$ does not hold because both $x$ and $z$ occur in term $t$ and, as
observed in the point above, $\mathrm{ind}_d(x, z)$ does not hold.

• For the reasons given in the point above, $\mathrm{lin}_d(t)$ does not hold; in contrast,
$\mathrm{lin}_d(f(y, y, z))$ holds.

• $\mathrm{share\_with}_d(w) = \{w, z\}$ and $\mathrm{share\_with}_d(x) = \{x, z\}$; thus, both $w$ and $x$
may share one or more variables with $z$; since we observed that $w$ and $x$ are
definitely independent in $d$, this means that the set of variables that $w$ shares
with $z$ is disjoint from the set of variables that $x$ shares with $z$.

• Let $t = f(w, z)$; then

\[ \mathrm{cyclic}_z^t(\mathit{sh}) = \overline{\mathrm{rel}}(\{w, z\}, \mathit{sh}) \cup \mathrm{rel}(\{w\}, \mathit{sh}) = \{v\} \cup \{wz\} = \mathit{sh} \setminus \{xz, z\}. \]

An intuitive explanation of the usefulness of this operator is deferred until 
after the introduction of the abstract mgu operator (see also Example 
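The auxiliary operators of Definition 25 are simple set computations, and the claims of Example 27 can be checked with a few lines of code. The sketch below is an illustration under assumptions, not the authors' implementation: an abstract element is a triple `(sh, f, l)` of Python sets, sharing groups are frozensets of one-character variable names, terms are nested tuples headed by the functor, and `rel_bar` stands for the complement of `rel` (the groups disjoint from the given variables), as used in the definition of the cyclic operator.

```python
def groups(*names):
    """Build a sharing set from strings such as "wz" (one character per variable)."""
    return {frozenset(name) for name in names}

def occurrences(term):
    """List of variable occurrences; variables are str, compound terms are tuples."""
    if isinstance(term, str):
        return [term]
    return [v for arg in term[1:] for v in occurrences(arg)]

def rel(vs, sh):
    """Sharing groups that mention at least one variable of vs."""
    return {s for s in sh if s & vs}

def rel_bar(vs, sh):
    """Sharing groups that mention no variable of vs."""
    return {s for s in sh if not s & vs}

def vars_of(sh):
    return set().union(*sh)

def ind(d, s, t):
    sh = d[0]
    return not (rel(set(occurrences(s)), sh) & rel(set(occurrences(t)), sh))

def ground(d, vi, t):
    return set(occurrences(t)) <= vi - vars_of(d[0])

def free(d, t):
    return isinstance(t, str) and t in d[1]

def occ_lin(d, vi, y, t):
    occs = occurrences(t)
    return ground(d, vi, y) or (
        occs.count(y) == 1 and y in d[2]
        and all(ind(d, y, z) for z in set(occs) if z != y))

def lin(d, vi, t):
    return all(occ_lin(d, vi, y, t) for y in set(occurrences(t)))

def share_with(d, t):
    return vars_of(rel(set(occurrences(t)), d[0]))

def cyclic(x, t, sh):
    vt = set(occurrences(t))
    return rel_bar({x} | vt, sh) | rel(vt - {x}, sh)

# The abstract element of Example 27.
VI = set("vwxyz")
d = (groups("v", "wz", "xz", "z"), {"v"}, set("vxyz"))
t = ("f", "w", "x", "x", "y", "y", "z")
```

For instance, `cyclic("z", ("f", "w", "z"), d[0])` returns the two groups `v` and `wz`, matching the last item of the example.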

We now introduce the abstract mgu operator, specifying how a single binding 
affects each component of the domain SFL in the context of a syntactic equality 
theory T. 

Definition 28 

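($\mathrm{amgu}_S$.) The function $\mathrm{amgu}_S\colon \mathit{SFL} \times \mathit{Bind} \to \mathit{SFL}$ captures the effects of a binding
on an element of $\mathit{SFL}$. Let $d = \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$ and $(x \mapsto t) \in \mathit{Bind}$, where
$\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Let also

\[ \mathit{sh}' \stackrel{\mathrm{def}}{=} \mathrm{cyclic}_x^t(\mathit{sh}_- \cup \mathit{sh}''), \]

where

\[ \mathit{sh}_x \stackrel{\mathrm{def}}{=} \mathrm{rel}(\{x\}, \mathit{sh}), \qquad \mathit{sh}_t \stackrel{\mathrm{def}}{=} \mathrm{rel}(\mathrm{vars}(t), \mathit{sh}), \]

\[ \mathit{sh}_{xt} \stackrel{\mathrm{def}}{=} \mathit{sh}_x \cap \mathit{sh}_t, \qquad \mathit{sh}_- \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(\{x\} \cup \mathrm{vars}(t), \mathit{sh}), \]

\[ \mathit{sh}'' \stackrel{\mathrm{def}}{=} \begin{cases}
\mathrm{bin}(\mathit{sh}_x, \mathit{sh}_t), & \text{if } \mathrm{free}_d(x) \lor \mathrm{free}_d(t); \\
\mathrm{bin}\bigl( \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^{\star}),\ \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^{\star}) \bigr), & \text{if } \mathrm{lin}_d(x) \land \mathrm{lin}_d(t); \\
\mathrm{bin}(\mathit{sh}_x^{\star}, \mathit{sh}_t), & \text{if } \mathrm{lin}_d(x); \\
\mathrm{bin}(\mathit{sh}_x, \mathit{sh}_t^{\star}), & \text{if } \mathrm{lin}_d(t); \\
\mathrm{bin}(\mathit{sh}_x^{\star}, \mathit{sh}_t^{\star}), & \text{otherwise.}
\end{cases} \]

Letting $S_x \stackrel{\mathrm{def}}{=} \mathrm{share\_with}_d(x)$ and $S_t \stackrel{\mathrm{def}}{=} \mathrm{share\_with}_d(t)$, we also define

\[ f' \stackrel{\mathrm{def}}{=} \begin{cases}
f, & \text{if } \mathrm{free}_d(x) \land \mathrm{free}_d(t); \\
f \setminus S_x, & \text{if } \mathrm{free}_d(x); \\
f \setminus S_t, & \text{if } \mathrm{free}_d(t); \\
f \setminus (S_x \cup S_t), & \text{otherwise;}
\end{cases} \]

\[ l' \stackrel{\mathrm{def}}{=} \bigl( \mathit{VI} \setminus \mathrm{vars}(\mathit{sh}') \bigr) \cup f' \cup l'', \]

where

\[ l'' \stackrel{\mathrm{def}}{=} \begin{cases}
l \setminus (S_x \cap S_t), & \text{if } \mathrm{lin}_d(x) \land \mathrm{lin}_d(t); \\
l \setminus S_x, & \text{if } \mathrm{lin}_d(x); \\
l \setminus S_t, & \text{if } \mathrm{lin}_d(t); \\
l \setminus (S_x \cup S_t), & \text{otherwise.}
\end{cases} \]

Then

\[ \mathrm{amgu}_S(d, x \mapsto t) \stackrel{\mathrm{def}}{=} \begin{cases}
\bot_S, & \text{if } d = \bot_S \lor (T = \mathcal{FT} \land x \in \mathrm{vars}(t)); \\
\langle \mathit{sh}', f', l' \rangle, & \text{otherwise.}
\end{cases} \]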



The next result states that the abstract mgu operator is a correct approximation 
of the concrete one. 

Theorem 29 

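Let $d \in \mathit{SFL}$ and $(x \mapsto t) \in \mathit{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then, for all
$\sigma \in \gamma_S(d)$ and $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ in the syntactic equality theory $T$, we
have $\tau \in \gamma_S(\mathrm{amgu}_S(d, x \mapsto t))$.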

We now highlight the similarities and differences of the operator $\mathrm{amgu}_S$ with
respect to the corresponding ones defined in the "classical" proposals for the integration
of set-sharing with freeness and linearity, such as (Bruynooghe et al. 1994a;
Bruynooghe et al. 1995; Hans and Winkler 1992; Langen 1990). Note that, when
comparing our domain with the proposal in (Bruynooghe et al. 1994a), we deliberately
ignore all those enhancements that depend on properties that cannot be
represented in $\mathit{SFL}$ (i.e., compoundness and explicit structural information).

• In the computation of the set-sharing component, the main difference can be
observed in the second, third and fourth cases of the definition of $\mathit{sh}''$: here
we omit one of the star-unions even when the terms $x$ and $t$ possibly share. In
contrast, in (Bruynooghe et al. 1994a; Hans and Winkler 1992; Langen 1990)
the corresponding star-union is avoided only when $\mathrm{ind}_d(x, t)$ holds. Note that
when $\mathrm{ind}_d(x, t)$ holds in the second case of $\mathit{sh}''$, then we have $\mathit{sh}_{xt} = \varnothing$; thus,
the whole computation for this case reduces to $\mathit{sh}'' = \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_t)$, as was
the case in the previous proposals.

• Another improvement on the set-sharing component can be observed in the
definition of $\mathit{sh}'$: the $\mathrm{cyclic}_x^t$ operator allows the set-sharing description to be
further enhanced when dealing with explicitly cyclic bindings, i.e., when
$x \in \mathrm{vars}(t)$. This is the rewording of a similar enhancement proposed in (Bagnara 1997)
for the domain Pos in the context of groundness analysis. Its net effect is to
recover some groundness and sharing dependencies that would have been
unnecessarily lost when using the standard operators. When $x \notin \mathrm{vars}(t)$, we
have $\mathrm{cyclic}_x^t(\mathit{sh}_- \cup \mathit{sh}'') = \mathit{sh}_- \cup \mathit{sh}''$.

• The computation of the freeness component $f'$ is the same as specified in
(Bruynooghe et al. 1994a; Hans and Winkler 1992), and is more precise than the one
defined in (Langen 1990).

• The computation of the linearity component $l'$ is the same as specified in
(Bruynooghe et al. 1994a), and is more precise than those defined in
(Hans and Winkler 1992; Langen 1990).

In the following examples we show that the improvements in the abstract computation
of the sharing component allow, in particular cases, better information to be derived
than that obtainable by using the classical abstract unification operators.

Example 30 

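Let $\mathit{VI} = \{x, x_1, x_2, y, y_1, y_2, z\}$ and $\sigma \in \mathit{RSubst}$ such that

\[ \sigma \stackrel{\mathrm{def}}{=} \{ x \mapsto f(x_1, x_2, z),\ y \mapsto f(y_1, z, y_2) \}. \]

By Definition 23, we have $d \stackrel{\mathrm{def}}{=} \alpha_S(\{\sigma\}) = \langle \mathit{sh}, f, l \rangle$, where

\[ \mathit{sh} = \{xx_1, xx_2, xyz, yy_1, yy_2\}, \qquad f = \mathit{VI} \setminus \{x, y\}, \qquad l = \mathit{VI}. \]

Consider the binding $(x \mapsto y) \in \mathit{Bind}$. In the concrete domain, we compute (a
substitution equivalent to) $\tau \in \mathrm{mgs}(\sigma \cup \{x = y\})$, where

\[ \tau = \{ x \mapsto f(y_1, y_2, y_2),\ y \mapsto f(y_1, y_2, y_2),\ x_1 \mapsto y_1,\ x_2 \mapsto y_2,\ z \mapsto y_2 \}. \]

Note that $\alpha_S(\{\tau\}) = \langle \mathit{sh}_\tau, f_\tau, l_\tau \rangle$, where $\mathit{sh}_\tau = \{xx_1yy_1, xx_2yy_2z\}$, so that the
pairs of variables $P_x = \{x_1, x_2\}$ and $P_y = \{y_1, y_2\}$ keep their independence.

When evaluating the sharing component of $\mathrm{amgu}_S(d, x \mapsto y)$, using the notation
of Definition 28, we have

\[ \mathit{sh}_x = \{xx_1, xx_2, xyz\}, \qquad \mathit{sh}_t = \{xyz, yy_1, yy_2\}, \]

\[ \mathit{sh}_{xt} = \{xyz\}, \qquad \mathit{sh}_- = \varnothing. \]

Since both $\mathrm{lin}_d(x)$ and $\mathrm{lin}_d(y)$ hold, we apply the second case of the definition of
$\mathit{sh}''$ so that

\[ \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^{\star}) = \{xx_1, xx_1yz, xx_2, xx_2yz, xyz\}, \]
\[ \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^{\star}) = \{xyy_1z, xyy_2z, xyz, yy_1, yy_2\}, \]
\[ \begin{aligned}
\mathit{sh}'' &= \mathrm{bin}\bigl( \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^{\star}),\ \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^{\star}) \bigr) \\
&= \{ xx_1yy_1, xx_1yy_1z, xx_1yy_2, xx_1yy_2z, xx_1yz, \\
&\qquad xx_2yy_1, xx_2yy_1z, xx_2yy_2, xx_2yy_2z, xx_2yz, \\
&\qquad xyy_1z, xyy_2z, xyz \}.
\end{aligned} \]

Finally, as the binding is not cyclic, we obtain $\mathit{sh}' = \mathit{sh}''$. Thus $\mathrm{amgu}_S$ captures the
fact that pairs $P_x$ and $P_y$ keep their independence.

In contrast, since $\mathrm{ind}_d(x, y)$ does not hold, all of the classical definitions of abstract
unification would have required the star-closure of both $\mathit{sh}_x$ and $\mathit{sh}_t$, resulting
in an abstract element including, among the others, the sharing group
$S = \{x, x_1, x_2, y, y_1, y_2\}$. Since $P_x \cup P_y \subseteq S$, this independence information would
have been unnecessarily lost.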
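The sharing computation of Example 30 can be reproduced with the following sketch of the set-sharing component of Definition 28. This is an illustration under assumptions rather than the authors' code: sharing groups are frozensets of variable names, the freeness and linearity tests are passed in as Boolean flags instead of being computed from the abstract element, and all function names are hypothetical.

```python
def groups(*names):
    """Build a sharing set from space-separated variable names, e.g. "x x1"."""
    return {frozenset(name.split()) for name in names}

def rel(vs, sh):
    return {s for s in sh if s & vs}

def rel_bar(vs, sh):
    return {s for s in sh if not s & vs}

def bin_union(sh1, sh2):
    """Binary union: all unions of one group from each argument."""
    return {a | b for a in sh1 for b in sh2}

def star(sh):
    """Star-union: closure of sh under union of its groups."""
    closure = set(sh)
    while True:
        bigger = closure | bin_union(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger

def sharing_component(sh, x, vt, free_x, free_t, lin_x, lin_t):
    """Compute sh' for the binding x -> t, where vt = vars(t)."""
    sh_x, sh_t = rel({x}, sh), rel(vt, sh)
    sh_xt = sh_x & sh_t
    sh_minus = rel_bar({x} | vt, sh)
    if free_x or free_t:
        sh2 = bin_union(sh_x, sh_t)
    elif lin_x and lin_t:
        s = star(sh_xt)
        sh2 = bin_union(sh_x | bin_union(sh_x, s), sh_t | bin_union(sh_t, s))
    elif lin_x:
        sh2 = bin_union(star(sh_x), sh_t)
    elif lin_t:
        sh2 = bin_union(sh_x, star(sh_t))
    else:
        sh2 = bin_union(star(sh_x), star(sh_t))
    merged = sh_minus | sh2
    # cyclic_x^t: a no-op unless x occurs in t
    return rel_bar({x} | vt, merged) | rel(vt - {x}, merged)

# Example 30: binding x -> y, with both sides linear and neither free.
sh = groups("x x1", "x x2", "x y z", "y y1", "y y2")
new = sharing_component(sh, "x", {"y"}, False, False, True, True)
# The double star-closure that the text attributes to the classical operators.
classical = sharing_component(sh, "x", {"y"}, False, False, False, False)
```

Here `new` contains 13 sharing groups, none of which includes both `x1` and `x2` or both `y1` and `y2`, whereas `classical` contains the group made of `x`, `x1`, `x2`, `y`, `y1` and `y2`.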

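Similar examples can be devised for the third and fourth cases of the definition
of $\mathit{sh}''$, where only one side of the binding is known to be linear. The next example
shows the precision improvements arising from the use of the $\mathrm{cyclic}_x^t$ operator.

Example 31

Let $\mathit{VI} = \{x, x_1, x_2, y\}$ and $\sigma \stackrel{\mathrm{def}}{=} \{ x \mapsto f(x_1, x_2) \}$. By Definition 23, we have
$d \stackrel{\mathrm{def}}{=} \alpha_S(\{\sigma\}) = \langle \mathit{sh}, f, l \rangle$, where

\[ \mathit{sh} = \{xx_1, xx_2, y\}, \qquad f = \mathit{VI} \setminus \{x\}, \qquad l = \mathit{VI}. \]

Let $t = f(x, y)$ and consider the cyclic binding $(x \mapsto t) \in \mathit{Bind}$. In the concrete
domain, we compute (a substitution equivalent to) $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$, where

\[ \tau = \{ x \mapsto f(x_1, x_2),\ x_1 \mapsto f(x_1, x_2),\ y \mapsto x_2 \}. \]

Note that if we further instantiate $\tau$ by grounding $y$, then variables $x$, $x_1$ and
$x_2$ would become ground too. Formally we have $\alpha_S(\{\tau\}) = \langle \mathit{sh}_\tau, f_\tau, l_\tau \rangle$, where
$\mathit{sh}_\tau = \{xx_1x_2y\}$. Thus, as observed above, $y$ covers $x$, $x_1$ and $x_2$. When abstractly
evaluating the binding, we compute

\[ \mathit{sh}_x = \{xx_1, xx_2\}, \qquad \mathit{sh}_t = \{xx_1, xx_2, y\}, \]

\[ \mathit{sh}_{xt} = \mathit{sh}_x, \qquad \mathit{sh}_- = \varnothing. \]

Since both $\mathrm{lin}_d(x)$ and $\mathrm{lin}_d(t)$ hold, we apply the second case of the definition of
$\mathit{sh}''$, so that

\[ \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^{\star}) = \mathit{sh}_x^{\star} = \{xx_1, xx_1x_2, xx_2\}, \]
\[ \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^{\star}) = \{xx_1, xx_1x_2, xx_1x_2y, xx_1y, xx_2, xx_2y, y\}, \]
\[ \begin{aligned}
\mathit{sh}'' &= \mathrm{bin}\bigl( \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^{\star}),\ \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^{\star}) \bigr) \\
&= \{xx_1, xx_1x_2, xx_1x_2y, xx_1y, xx_2, xx_2y\}.
\end{aligned} \]

Thus, as $x \in \mathrm{vars}(t)$, we obtain

\[ \begin{aligned}
\mathit{sh}' &= \mathrm{cyclic}_x^t(\mathit{sh}_- \cup \mathit{sh}'') \\
&= \overline{\mathrm{rel}}(\{x\} \cup \mathrm{vars}(t), \mathit{sh}'') \cup \mathrm{rel}(\mathrm{vars}(t) \setminus \{x\}, \mathit{sh}'') \\
&= \varnothing \cup \mathrm{rel}(\{y\}, \mathit{sh}'') \\
&= \{xx_1x_2y, xx_1y, xx_2y\}.
\end{aligned} \]

Note that, in the element $\mathit{sh}_- \cup \mathit{sh}'' = \mathit{sh}''$ (which is the abstract element that
would have been computed when not exploiting the $\mathrm{cyclic}_x^t$ operator) variable $y$
covers none of variables $x$, $x_1$ and $x_2$. Thus, by applying the $\mathrm{cyclic}_x^t$ operator, this
covering information is restored.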
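The effect of the cyclic operator in Example 31 can be checked directly on the sharing set computed there. The sketch below is illustrative only. In particular, the helper `covers` encodes one reading of "y covers x" (every sharing group containing x also contains y, so that grounding y grounds x); this reading and all names are assumptions, not definitions taken from the text.

```python
def groups(*names):
    """Build a sharing set from space-separated variable names."""
    return {frozenset(name.split()) for name in names}

def rel(vs, sh):
    return {s for s in sh if s & vs}

def rel_bar(vs, sh):
    return {s for s in sh if not s & vs}

def cyclic(x, vt, sh):
    """cyclic_x^t(sh), where vt = vars(t)."""
    return rel_bar({x} | vt, sh) | rel(vt - {x}, sh)

def covers(y, x, sh):
    """Assumed reading: every group containing x also contains y."""
    return all(y in s for s in sh if x in s)

# sh'' as computed in Example 31 for the binding x -> f(x, y).
sh2 = groups("x x1", "x x1 x2", "x x1 x2 y", "x x1 y", "x x2", "x x2 y")
sh_prime = cyclic("x", {"x", "y"}, sh2)
```

Before the operator is applied, `covers("y", v, sh2)` is false for each of `x`, `x1` and `x2`; afterwards it holds for all three on `sh_prime`.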

The full abstract unification operator $\mathrm{aunify}_S$, capturing the effect of a sequence
of bindings on an abstract element, can now be specified by a straightforward
inductive definition using the operator $\mathrm{amgu}_S$.

Definition 32

($\mathrm{aunify}_S$.) The operator $\mathrm{aunify}_S\colon \mathit{SFL} \times \mathit{Bind}^{*} \to \mathit{SFL}$ is defined, for each $d \in \mathit{SFL}$
and each sequence of bindings $\mathit{bs} \in \mathit{Bind}^{*}$, by

\[ \mathrm{aunify}_S(d, \mathit{bs}) \stackrel{\mathrm{def}}{=} \begin{cases}
d, & \text{if } \mathit{bs} = \epsilon; \\
\mathrm{aunify}_S\bigl( \mathrm{amgu}_S(d, x \mapsto t), \mathit{bs}' \bigr), & \text{if } \mathit{bs} = (x \mapsto t) \mathbin{.} \mathit{bs}'.
\end{cases} \]

Note that the second argument of $\mathrm{aunify}_S$ is a sequence of bindings (i.e., it is not a
substitution, which is a set of bindings), because $\mathrm{amgu}_S$ is neither commutative nor
idempotent, so that the multiplicity and the actual order of application of the bindings
can influence the overall result of the abstract computation. The correctness
of the $\mathrm{aunify}_S$ operator is simply inherited from the correctness of the underlying
$\mathrm{amgu}_S$ operator. In particular, any reordering of the bindings in the sequence $\mathit{bs}$
still results in a correct implementation of $\mathrm{aunify}_S$.

The 'merge-over-all-path' operator on the domain $\mathit{SFL}$ is provided by $\mathrm{alub}_S$ and
is correct by definition. Finally, we define the abstract existential quantification
operator for the domain $\mathit{SFL}$, whose correctness does not pose any problem.

Definition 33

($\mathrm{aexists}_S$.) The function $\mathrm{aexists}_S\colon \mathit{SFL} \times \wp_f(\mathit{VI}) \to \mathit{SFL}$ provides the abstract
existential quantification of an element with respect to a subset of the variables of
interest. For each $d \stackrel{\mathrm{def}}{=} \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$ and $V \subseteq \mathit{VI}$,

\[ \mathrm{aexists}_S(\langle \mathit{sh}, f, l \rangle, V) \stackrel{\mathrm{def}}{=} \langle \mathrm{aexists}(\mathit{sh}, V),\ f \cup V,\ l \cup V \rangle. \]

The intuition behind the definition of the abstract operator $\mathrm{aexists}_S$ is the following.
As explained in Section 2, any substitution $\sigma \in \mathit{RSubst}$ can be interpreted,
under the given equality theory $T$, as a first-order logical formula; thus, for each
set of variables $V$, it is possible to consider the (concrete) existential quantification
$\exists V \mathrel{.} \sigma$. The goal of the abstract operator $\mathrm{aexists}_S$ is to provide a correct approximation
of such a quantification starting from any correct approximation for $\sigma$.

Example 34

Let $\mathit{VI} = \{x, y, z\}$ and $\sigma = \{ x \mapsto f(v_1, v_2),\ y \mapsto g(v_2, v_3),\ z \mapsto f(v_1, v_1) \}$, so that,
by Definition 23,

\[ d = \alpha_S(\{\sigma\}) = \langle \{xy, xz, y\},\ \varnothing,\ \{x, y\} \rangle. \]

Let $V = \{y, z\}$ and consider the concrete element corresponding to the logical
formula $\exists V \mathrel{.} \sigma$. Note that $T \vdash \forall(\tau \leftrightarrow \exists V \mathrel{.} \sigma)$, where $\tau = \{ x \mapsto f(v_1, v_2) \}$. By
applying Definition 33, we obtain

\[ \mathrm{aexists}_S(d, V) = \langle \{x, y, z\},\ \{y, z\},\ \{x, y, z\} \rangle = \alpha_S(\{\tau\}). \]

It is worth stressing that such an operator does not affect the set $\mathit{VI}$ of the variables
of interest. In particular, the abstract element $\mathrm{aexists}_S(d, V)$ still has to provide
correct information about variables $y$ and $z$. Intuitively, since all the occurrences
of $y$ and $z$ in $\exists V \mathrel{.} \sigma$ are bound by the existential quantifier, the two variables of
interest are un-aliased, free and linear.

Note that an abstract projection operator, i.e., an operator that actually modifies
the set of variables of interest, is easily specified by composing the operator $\mathrm{aexists}_S$
with an operator that simply removes, from all the components of $\mathit{SFL}$ and from
the set of variables of interest $\mathit{VI}$, those variables that have to be projected out.
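Example 34 can be replayed with the short sketch below. Definition 33 relies on an operator $\mathrm{aexists}$ on the set-sharing component that is defined outside this section; the version used here (remove the quantified variables from every sharing group, drop the groups that become empty, and add a singleton group for each quantified variable) is an assumption chosen because it reproduces the result stated in the example. The function names are likewise hypothetical.

```python
def aexists_sh(sh, v):
    """Assumed set-sharing quantification: project out v, then add singletons."""
    projected = {s - v for s in sh} - {frozenset()}
    return projected | {frozenset([x]) for x in v}

def aexists_s(d, v):
    """aexists_S on a triple (sh, f, l): quantified variables become free and linear."""
    sh, f, l = d
    return (aexists_sh(sh, v), f | v, l | v)

# The abstract element of Example 34 and the quantified variables V = {y, z}.
d = ({frozenset("xy"), frozenset("xz"), frozenset("y")}, set(), {"x", "y"})
result = aexists_s(d, {"y", "z"})
```

The result has the three singleton sharing groups for `x`, `y` and `z`, freeness component `{y, z}` and linearity component `{x, y, z}`, as in the example.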



4 A Formal Comparison Between SFL and ASub 

As we have already observed, Example 30 shows that the abstract domain $\mathit{SFL}$,
when equipped with the abstract mgu operator introduced in Section 3.2, can yield
results that are strictly more precise than all the classical combinations of set-sharing
with freeness and linearity information. In this section we show that the
same example has another interesting, unexpected consequence, since it can be used
to formally prove that all the classical combinations of set-sharing with freeness and
linearity, including those presented in (Bagnara et al. 2000; Bruynooghe et al. 1994a;
Hans and Winkler 1992; Langen 1990), are not uniformly more precise than the abstract
domain ASub (Søndergaard 1986), which is based on pair-sharing.

To formalize the above observation, we now introduce the ASub domain and
the corresponding abstract semantics operators as specified in (Codish et al. 1991).
The elements of the abstract domain ASub have two components: the first one is
a set of variables that are known to be definitely ground; the second one encodes
both possible pair-sharing and possible non-linearity into a single relation defined
on the set of variables. Intuitively, when $x \neq y$ and $(x, y) \in \mathit{VI}^2$ occurs in the
second component, then $x$ and $y$ may share a variable; when $(x, x) \in \mathit{VI}^2$ occurs
in the second component, then $x$ may be non-linear. The second component always
encodes a symmetric relation; thus, for notational convenience and without any loss
of generality (King 2000), we will represent each pair $(x, y)$ in such a relation as the
sharing group $S = \{x, y\}$, which will have cardinality 1 or 2 depending on whether
$x = y$ or not, respectively.

Definition 35 

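(The domain $\mathit{ASub}_\bot$.) The abstract domain $\mathit{ASub}_\bot$ is defined as
$\mathit{ASub}_\bot \stackrel{\mathrm{def}}{=} \{\bot_{\mathit{ASub}}\} \cup \mathit{ASub}$, where

\[ \mathit{ASub} \stackrel{\mathrm{def}}{=} \{\, (G, R) \in \wp(\mathit{VI}) \times \mathit{SH} \mid G \cap \mathrm{vars}(R) = \varnothing,\ \forall S \in R : 1 \leq \# S \leq 2 \,\}. \]

For $i \in \{1, 2\}$, let $\kappa_i = (G_i, R_i) \in \mathit{ASub}$. Then

\[ \kappa_1 \preceq_{\mathit{ASub}} \kappa_2 \iff G_1 \supseteq G_2 \land R_1 \subseteq R_2. \]

The partial order $\preceq_{\mathit{ASub}}$ is extended on $\mathit{ASub}_\bot$ by letting $\bot_{\mathit{ASub}}$ be the bottom element.

Let $u, v \in \mathit{VI}$ and $\kappa = (G, R) \in \mathit{ASub}$. Then $u \stackrel{\kappa}{\leftrightarrow} v$ is a shorthand for the
condition $\{u, v\} \in R$, whereas $u \stackrel{\kappa}{\Leftrightarrow} v$ is a shorthand for $u = v \lor \{u, v\} \in R$.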

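It is well-known that the domain $\mathit{ASub}_\bot$ can be obtained by a further abstraction
of any domain such as $\mathit{SFL}$ that is based on set-sharing and enhanced with linearity
information. The following definition formalizes this abstraction.

Definition 36

($\alpha_{\mathit{ASub}}\colon \mathit{SFL} \to \mathit{ASub}_\bot$.) Let $d = \langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$. Then

\[ \alpha_{\mathit{ASub}}(d) \stackrel{\mathrm{def}}{=} \begin{cases}
\bot_{\mathit{ASub}}, & \text{if } d = \bot_S; \\
(G, R), & \text{otherwise;}
\end{cases} \]

where

\[ G \stackrel{\mathrm{def}}{=} \{\, x \in \mathit{VI} \mid x \notin \mathrm{vars}(\mathit{sh}) \,\}, \]

\[ R \stackrel{\mathrm{def}}{=} \{\, \{x\} \subseteq \mathit{VI} \mid x \in \mathrm{vars}(\mathit{sh}) \land x \notin l \,\} \cup \{\, \{x, y\} \subseteq \mathit{VI} \mid x \neq y \land \exists S \in \mathit{sh} \mathrel{.} \{x, y\} \subseteq S \,\}. \]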
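The abstraction of Definition 36 amounts to collecting the ground variables, the possibly non-linear variables and all pairs of variables that occur together in some sharing group. A minimal sketch follows; the representation (triples of Python sets, frozensets as sharing groups) and the names are assumptions, and the case of the bottom element is deliberately left out.

```python
from itertools import combinations

def groups(*names):
    """Build a set of groups from space-separated variable names."""
    return {frozenset(name.split()) for name in names}

def alpha_asub(d, vi):
    """Map a non-bottom element (sh, f, l) of SFL to a pair (G, R) of ASub."""
    sh, f, l = d
    shared = set().union(*sh)
    g = set(vi) - shared                                 # definitely ground
    r = {frozenset([x]) for x in shared if x not in l}   # possibly non-linear
    r |= {frozenset(p) for s in sh for p in combinations(sorted(s), 2)}
    return g, r

# The abstract element d of Example 30.
VI30 = set("x x1 x2 y y1 y2 z".split())
d30 = (groups("x x1", "x x2", "x y z", "y y1", "y y2"), VI30 - {"x", "y"}, VI30)

# The abstract element d of Example 34, where z is not known to be linear.
d34 = (groups("x y", "x z", "y"), set(), {"x", "y"})
```

On `d30` the ground component is empty and the relation consists of seven pairs; on `d34` the relation also contains the singleton for `z`, which records possible non-linearity.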

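The definition of abstract unification in (Codish et al. 1991) is based on a few
auxiliary operators. The first of these introduces the concept of abstract multiplicity
for a term under a given abstract substitution, therefore modeling the notion of
definite groundness and definite linearity.

Definition 37

(Abstract multiplicity.) Let $\kappa = (G, R) \in \mathit{ASub}$ and let $t \in \mathit{HTerms}$ be a term
such that $\mathrm{vars}(t) \subseteq \mathit{VI}$. We say that $y \in \mathrm{vars}(t)$ occurs linearly (in $t$) in $\kappa$ if and
only if $\mathrm{occ\_lin}_\kappa\colon \mathit{VI} \times \mathit{HTerms} \to \mathit{Bool}$ holds for $(y, t)$, where

\[ \mathrm{occ\_lin}_\kappa(y, t) \stackrel{\mathrm{def}}{=} y \in G \lor \bigl( \mathrm{occ\_lin}(y, t) \land \forall z \in \mathrm{vars}(t) : \{y, z\} \notin R \bigr). \]

We say that $t$ has abstract multiplicity $m$ in $\kappa$ if and only if $\chi_\kappa(t) = m$, where
$\chi_\kappa\colon \mathit{HTerms} \to \{0, 1, 2\}$ is defined as follows

\[ \chi_\kappa(t) \stackrel{\mathrm{def}}{=} \begin{cases}
0, & \text{if } \mathrm{vars}(t) \subseteq G; \\
1, & \text{if } \forall y \in \mathrm{vars}(t) : \mathrm{occ\_lin}_\kappa(y, t); \\
2, & \text{otherwise.}
\end{cases} \]

For any binding $x \mapsto t$, the function $\chi_\kappa\colon \mathit{Bind} \to \{0\} \cup \{1, 2\}^2$ is defined as follows

\[ \chi_\kappa(x \mapsto t) \stackrel{\mathrm{def}}{=} \begin{cases}
0, & \text{if } \chi_\kappa(x) = 0 \text{ or } \chi_\kappa(t) = 0; \\
\bigl( \chi_\kappa(x), \chi_\kappa(t) \bigr), & \text{otherwise.}
\end{cases} \]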

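It is worth noting that, modulo a few insignificant differences in notation, the multiplicity
operator $\chi_\kappa$ defined above corresponds to the abstract multiplicity operator
$\chi^{\mathcal{A}}$, which was introduced in (Codish et al. 1991, Definition 3.4) and provided with
an executable specification in (King 2000, Definition 4.3). Similarly, the next definition
corresponds to (Codish et al. 1991, Definition 4.3).

Definition 38

(Sharing caused by an abstract equation.) For each $\kappa \in \mathit{ASub}$ and $(x \mapsto t) \in \mathit{Bind}$,
where $V_x = \{x\}$ and $V_t = \mathrm{vars}(t)$ are such that $V_x \cup V_t \subseteq \mathit{VI}$, the function
$\mathrm{soln}\colon \mathit{ASub} \times \mathit{Bind} \to \mathit{ASub}$ is defined as follows

\[ \mathrm{soln}(\kappa, x \mapsto t) \stackrel{\mathrm{def}}{=} \begin{cases}
(V_x \cup V_t, \varnothing), & \text{if } \chi_\kappa(x \mapsto t) = 0; \\
\bigl( \varnothing, \mathrm{bin}(V_x, V_t) \bigr), & \text{if } \chi_\kappa(x \mapsto t) = (1, 1); \\
\bigl( \varnothing, \mathrm{bin}(V_x, V_x \cup V_t) \bigr), & \text{if } \chi_\kappa(x \mapsto t) = (1, 2); \\
\bigl( \varnothing, \mathrm{bin}(V_x \cup V_t, V_t) \bigr), & \text{if } \chi_\kappa(x \mapsto t) = (2, 1); \\
\bigl( \varnothing, \mathrm{bin}(V_x \cup V_t, V_x \cup V_t) \bigr), & \text{if } \chi_\kappa(x \mapsto t) = (2, 2);
\end{cases} \]

where the function $\mathrm{bin}\colon \wp(\mathit{VI})^2 \to \mathit{SH}$, for each $V, W \subseteq \mathit{VI}$, is defined as follows

\[ \mathrm{bin}(V, W) \stackrel{\mathrm{def}}{=} \{\, \{v, w\} \subseteq \mathit{VI} \mid v \in V,\ w \in W \,\}. \]

The next definition corresponds to (Codish et al. 1991, Definition 4.5).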
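Definition 39

(Abstract composition.) Let $\kappa, \kappa' \in \mathit{ASub}$, where $\kappa = (G, R)$ and $\kappa' = (G', R')$.
Then $\kappa \circ \kappa' \stackrel{\mathrm{def}}{=} (G'', R'')$, where

\[ G'' \stackrel{\mathrm{def}}{=} G \cup G', \]

\[ R'' \stackrel{\mathrm{def}}{=} \Bigl\{\, \{u, v\} \in \mathit{SH} \Bigm| \{u, v\} \cap G'' = \varnothing,\ (u \stackrel{\kappa}{\leftrightarrow} v) \lor (\exists x, y \mathrel{.} u \stackrel{\kappa}{\Leftrightarrow} x \stackrel{\kappa'}{\leftrightarrow} y \stackrel{\kappa}{\Leftrightarrow} v) \,\Bigr\}. \]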

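We are now ready to define the abstract mgu operator for the domain $\mathit{ASub}_\bot$. This
operator can be viewed as a specialization of (Codish et al. 1991, Definition 4.6) for
the case when we have to abstract a single binding.

Definition 40

(Abstract mgu for $\mathit{ASub}_\bot$.) Let $\kappa \in \mathit{ASub}_\bot$ and $(x \mapsto t) \in \mathit{Bind}$, where
$\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then

\[ \mathrm{amgu}_{\mathit{ASub}}(\kappa, x \mapsto t) \stackrel{\mathrm{def}}{=} \begin{cases}
\bot_{\mathit{ASub}}, & \text{if } \kappa = \bot_{\mathit{ASub}} \lor (T = \mathcal{FT} \land x \in \mathrm{vars}(t)); \\
\kappa \circ \mathrm{soln}(\kappa, x \mapsto t), & \text{otherwise.}
\end{cases} \]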
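Definitions 37 to 40 can be put together in a few lines. The sketch below is a tentative reading, not the specification of Codish et al.: it assumes that abstract composition takes the union of the two groundness components and relates u and v when they are related in $\kappa$ or when u is weakly related in $\kappa$ to some x, x is related in $\kappa'$ to some y, and y is weakly related in $\kappa$ to v. A term is represented by the list of its variable occurrences, the bottom case of Definition 40 is omitted, and all names are hypothetical. The sample data is the pair-sharing abstraction of the element d of Example 30, with the binding that maps x to y.

```python
def pair(a, b):
    return frozenset((a, b))

def groups(*names):
    return {frozenset(name.split()) for name in names}

def bin_pairs(v, w):
    return {pair(a, b) for a in v for b in w}

def chi_term(kappa, occs):
    """Abstract multiplicity (0, 1 or 2) of a term given by its variable occurrences."""
    g, r = kappa
    vs = set(occs)
    if vs <= g:
        return 0
    def occ_lin(y):
        return y in g or (occs.count(y) == 1
                          and all(pair(y, z) not in r for z in vs))
    return 1 if all(occ_lin(y) for y in vs) else 2

def soln(kappa, x, occs):
    """Sharing caused by the binding x -> t, where occs lists the variables of t."""
    vx, vt = {x}, set(occs)
    cx, ct = chi_term(kappa, [x]), chi_term(kappa, occs)
    if cx == 0 or ct == 0:
        return vx | vt, set()
    left = vx | vt if cx == 2 else vx
    right = vx | vt if ct == 2 else vt
    return set(), bin_pairs(left, right)

def compose(kappa, kappa2, vi):
    """Assumed abstract composition of two elements of ASub."""
    (g, r), (g2, r2) = kappa, kappa2
    ground = g | g2
    live = set(vi) - ground
    def weak(u, v):
        return u == v or pair(u, v) in r
    def ordered(p):
        items = sorted(p)
        return [(items[0], items[-1]), (items[-1], items[0])]
    result = set()
    for u in live:
        for v in live:
            if pair(u, v) in r or any(weak(u, a) and weak(b, v)
                                      for p in r2 for a, b in ordered(p)):
                result.add(pair(u, v))
    return ground, result

def amgu_asub(kappa, x, occs, vi):
    return compose(kappa, soln(kappa, x, occs), vi)

VI = "x x1 x2 y y1 y2 z".split()
R = groups("x x1", "x x2", "x y", "x z", "y y1", "y y2", "y z")
G2, R2 = amgu_asub((set(), R), "x", ["y"], VI)
```

Under these assumptions the resulting relation `R2` extends `R` with fifteen new groups and contains neither the pair `x1`, `x2` nor the pair `y1`, `y2`.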

By repeating the abstract computation of Example 30 on the domain ASub, we
provide a formal proof that all the classical approaches based on set-sharing are
not uniformly more precise than the pair-sharing domain ASub.

Example 41

Consider the substitutions $\sigma, \tau \in \mathit{RSubst}$ and the abstract element $d \in \mathit{SFL}$ as
introduced in Example 30.

By Definition 36, we obtain $\kappa = \alpha_{\mathit{ASub}}(d) = (\varnothing, R)$, where

\[ R = \{xx_1, xx_2, xy, xz, yy_1, yy_2, yz\}. \]

When abstractly evaluating the binding $x \mapsto y$ according to Definition 40, we
compute the following:

\[ \chi_\kappa(x \mapsto y) = (1, 1), \]

\[ \mathrm{soln}(\kappa, x \mapsto y) = (\varnothing, \{xy\}), \]

\[ \mathrm{amgu}_{\mathit{ASub}}(\kappa, x \mapsto y) = \kappa \circ \mathrm{soln}(\kappa, x \mapsto y) = (\varnothing, R''), \]

where

\[ R'' = R \cup \{x, xy_1, xy_2, x_1y, x_1y_1, x_1y_2, x_1z, x_2y, x_2y_1, x_2y_2, x_2z, y, y_1z, y_2z, z\}. \]

Note that $\{x_1, x_2\} \notin R''$ and $\{y_1, y_2\} \notin R''$, so that these pairs of variables
keep their independence. In contrast, as observed in Example 30, the operators
in (Bagnara et al. 2000; Bruynooghe et al. 1994a; Hans and Winkler 1992; Langen 1990)
will fail to preserve the independence of these pairs.

We now show that the abstract domain $\mathit{SFL}$, when equipped with the operators
introduced in Section 3.2, is uniformly more precise than the domain ASub.
In particular, the following theorem states that the abstract operator $\mathrm{amgu}_S$ of
Definition 28 is uniformly more precise than the abstract operator $\mathrm{amgu}_{\mathit{ASub}}$.

Theorem 42

Let $d \in \mathit{SFL}$ and $\kappa \in \mathit{ASub}_\bot$ be such that $\alpha_{\mathit{ASub}}(d) \preceq_{\mathit{ASub}} \kappa$. Let also $(x \mapsto t) \in \mathit{Bind}$,
where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then

\[ \alpha_{\mathit{ASub}}\bigl( \mathrm{amgu}_S(d, x \mapsto t) \bigr) \preceq_{\mathit{ASub}} \mathrm{amgu}_{\mathit{ASub}}(\kappa, x \mapsto t). \]

Similar results can be stated for the other abstract operators, such as the abstract
existential quantification $\mathrm{aexists}_S$ and the merge-over-all-path operator $\mathrm{alub}_S$. It
is worth stressing that, when sequences of bindings come into play, the specification
provided in (Codish et al. 1991, Definition 4.7) requires that the grounding
bindings (i.e., those bindings such that $\chi_\kappa(x \mapsto t) = 0$) are evaluated before
the non-grounding ones. Clearly, if we want to lift the result of Theorem 42
so that it also applies to the operator $\mathrm{aunify}_S$, the same evaluation strategy has
to be adopted when computing on the domain $\mathit{SFL}$; this improvement is well-known
(Langen 1990, pp. 66-67) and already exploited in most implementations of
sharing analysis (Bagnara et al. 2000).

5 SFL2: Eliminating Redundancies

As done in (Bagnara et al. 2002; Zaffanella et al. 2002) for the plain set-sharing
domain $\mathit{SH}$, even when considering the richer domain $\mathit{SFL}$ it is natural to question
whether it contains redundancies with respect to the computation of the observable
properties.

It is worth stressing that the results presented in (Bagnara et al. 2002) and
(Zaffanella et al. 2002) cannot be simply inherited by the new domain. The concept of
"redundancy" depends on both the starting domain and the given observables: in the
$\mathit{SFL}$ domain both of these have changed. First of all, as can be seen by looking at the
definition of $\mathrm{amgu}_S$, freeness and linearity positively interact in the computation of sharing
information: a priori it is an open issue whether or not the "redundant" sharing
groups can play a role in such an interaction. Secondly, since freeness and linearity
information can be themselves usefully exploited in a number of applications of
static analysis (e.g., in the optimized implementation of concrete unification or in
occurs-check reduction), these properties have to be included in the observables.

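We will now show that the domain $\mathit{SFL}$ can be simplified by applying the same
notion of redundancy as identified in (Bagnara et al. 2002). Namely, in the definition
of $\mathit{SFL}$ it is possible to replace the set-sharing component $\mathit{SH}$ by $\mathit{PSD}$ without
affecting the precision on groundness, independence, freeness and linearity. In order
to prove such a claim, we now formalize the new observable properties.

Definition 43

(The observables of SFL.) The (overloaded) groundness and independence observables
$\rho_{\mathit{Con}}, \rho_{\mathit{PS}} \in \mathrm{uco}(\mathit{SFL})$ are defined, for each $\langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$, by

\[ \rho_{\mathit{Con}}(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \langle \rho_{\mathit{Con}}(\mathit{sh}), \varnothing, \varnothing \rangle, \]

\[ \rho_{\mathit{PS}}(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \langle \rho_{\mathit{PS}}(\mathit{sh}), \varnothing, \varnothing \rangle; \]

the freeness and linearity observables $\rho_F, \rho_L \in \mathrm{uco}(\mathit{SFL})$ are defined, for each
$\langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$, by

\[ \rho_F(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \langle \mathit{SG}, f, \varnothing \rangle, \]

\[ \rho_L(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \langle \mathit{SG}, \varnothing, l \rangle. \]

The overloading of $\rho_{\mathit{PSD}}$ working on the domain $\mathit{SFL}$ is the straightforward extension
of the corresponding operator on $\mathit{SH}$: in particular, the freeness and linearity
components are left untouched.

Definition 44

(Non-redundant SFL.) For each $\langle \mathit{sh}, f, l \rangle \in \mathit{SFL}$, the operator $\rho_{\mathit{PSD}} \in \mathrm{uco}(\mathit{SFL})$
is defined by

\[ \rho_{\mathit{PSD}}(\langle \mathit{sh}, f, l \rangle) \stackrel{\mathrm{def}}{=} \langle \rho_{\mathit{PSD}}(\mathit{sh}), f, l \rangle. \]

This operator induces the lattice $\mathit{SFL}_2 \stackrel{\mathrm{def}}{=} \rho_{\mathit{PSD}}(\mathit{SFL})$.

As proved in (Zaffanella et al. 2002), we have that $\rho_{\mathit{PSD}} \sqsubseteq (\rho_{\mathit{Con}} \sqcap \rho_{\mathit{PS}})$; by the above
definitions, it is also clear that $\rho_{\mathit{PSD}} \sqsubseteq (\rho_F \sqcap \rho_L)$; thus, $\rho_{\mathit{PSD}}$ is more precise than
the reduced product $(\rho_{\mathit{Con}} \sqcap \rho_{\mathit{PS}} \sqcap \rho_F \sqcap \rho_L)$. Informally, this means that the domain
$\mathit{SFL}_2$ is able to represent all of our observable properties without precision losses.

The next theorem shows that $\rho_{\mathit{PSD}}$ is a congruence with respect to the $\mathrm{aunify}_S$,
$\mathrm{alub}_S$ and $\mathrm{aexists}_S$ operators. This means that the domain $\mathit{SFL}_2$ is able to propagate
the information on the observables as precisely as $\mathit{SFL}$, therefore providing a
completeness result.

Theorem 45

Let $d_1, d_2 \in \mathit{SFL}$ be such that $\rho_{\mathit{PSD}}(d_1) = \rho_{\mathit{PSD}}(d_2)$. Then, for each sequence of
bindings $\mathit{bs} \in \mathit{Bind}^{*}$, for each $d' \in \mathit{SFL}$ and $V \in \wp(\mathit{VI})$,

\[ \rho_{\mathit{PSD}}\bigl( \mathrm{aunify}_S(d_1, \mathit{bs}) \bigr) = \rho_{\mathit{PSD}}\bigl( \mathrm{aunify}_S(d_2, \mathit{bs}) \bigr), \]

\[ \rho_{\mathit{PSD}}\bigl( \mathrm{alub}_S(d_1, d') \bigr) = \rho_{\mathit{PSD}}\bigl( \mathrm{alub}_S(d_2, d') \bigr), \]

\[ \rho_{\mathit{PSD}}\bigl( \mathrm{aexists}_S(d_1, V) \bigr) = \rho_{\mathit{PSD}}\bigl( \mathrm{aexists}_S(d_2, V) \bigr). \]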

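Finally, by providing the minimality result, we show that the domain $\mathit{SFL}_2$ is
indeed the generalized quotient (Cortesi et al. 1998; Giacobazzi et al. 1998) of $\mathit{SFL}$
with respect to the reduced product $(\rho_{\mathit{Con}} \sqcap \rho_{\mathit{PS}} \sqcap \rho_F \sqcap \rho_L)$.

Theorem 46

For each $i \in \{1, 2\}$, let $d_i = \langle \mathit{sh}_i, f_i, l_i \rangle \in \mathit{SFL}$ be such that $\rho_{\mathit{PSD}}(d_1) \neq \rho_{\mathit{PSD}}(d_2)$.
Then there exist a sequence of bindings $\mathit{bs} \in \mathit{Bind}^{*}$ and an observable property
$\rho \in \{\rho_{\mathit{Con}}, \rho_{\mathit{PS}}, \rho_F, \rho_L\}$ such that

\[ \rho\bigl( \mathrm{aunify}_S(d_1, \mathit{bs}) \bigr) \neq \rho\bigl( \mathrm{aunify}_S(d_2, \mathit{bs}) \bigr). \]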

As far as the implementation is concerned, the results proved in ( |Bagnara et al. 2002| ) 
for the domain PSD can also be applied to SFL2- In particular, in the definition 
of amgUg every occurrence of the star-union operator can be safely replaced by the 
self-bin-union operator. As a consequence, it is possible to provide an implementa- 
tion where the time complexity of the amgu^ operator is bounded by a polynomial 
in the number of sharing groups of the set-sharing component. 

The following result provides another optimization that can be applied when
both terms $x$ and $t$ are definitely linear, but neither of them is definitely free (i.e.,
when we compute $sh''$ by the second case stated in Definition 28).

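Theorem 47

Let $sh \in SH$ and $(x \mapsto t) \in Bind$, where $\{x\} \cup \mathrm{vars}(t) \subseteq VI$. Let
$sh_- \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(\{x\} \cup \mathrm{vars}(t), sh)$,
$sh_x \stackrel{\mathrm{def}}{=} \mathrm{rel}(\{x\}, sh)$,
$sh_t \stackrel{\mathrm{def}}{=} \mathrm{rel}(\mathrm{vars}(t), sh)$,
$sh_{xt} \stackrel{\mathrm{def}}{=} sh_x \cap sh_t$,
$sh_W \stackrel{\mathrm{def}}{=} \mathrm{rel}(W, sh)$, where $W \stackrel{\mathrm{def}}{=} \mathrm{vars}(t) \setminus \{x\}$, and

\[
sh^{\diamond} \stackrel{\mathrm{def}}{=} \mathrm{bin}\bigl(sh_x \cup \mathrm{bin}(sh_x, sh_{xt}^*),\; sh_t \cup \mathrm{bin}(sh_t, sh_{xt}^*)\bigr).
\]

Then it holds

\[
\rho_{PSD}\bigl(\mathrm{cyclic}_x^t(sh_- \cup sh^{\diamond})\bigr) =
\begin{cases}
\rho_{PSD}\bigl(sh_- \cup \mathrm{bin}(sh_x, sh_t)\bigr), & \text{if } x \notin \mathrm{vars}(t); \\
\rho_{PSD}\bigl(sh_- \cup \mathrm{bin}(sh_x^2, sh_W)\bigr), & \text{otherwise.}
\end{cases}
\]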



Therefore, even when terms $x$ and $t$ possibly share (i.e., when $sh_{xt} \neq \emptyset$), by using
$SFL_2$ we can avoid the expensive computation of at least one of the two inner binary
unions in the expression for $sh^{\diamond}$.
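
As a rough illustration of the saving in the case where $x$ does not occur in $t$, the Python sketch below evaluates the expression with the two inner binary unions and the plain binary union of the two relevant components on a small, made-up set-sharing element in which the two terms share. The helper names (`rel`, `bin_union`, `star_union`, `group`) and the sample data are assumptions made for this example. The sketch compares only the sizes of the two results; it does not compute $\rho_{PSD}$ or the cyclic operator, so it does not check the equality stated by the theorem.

```python
def rel(vs, sh):
    """Relevant component: the groups of sh sharing a variable with vs."""
    return {g for g in sh if g & vs}


def bin_union(sh1, sh2):
    """Binary union: every pairwise union of a group of sh1 with a group of sh2."""
    return {a | b for a in sh1 for b in sh2}


def star_union(sh):
    """Star-union: closure of sh under union."""
    closure = set(sh)
    frontier = set(sh)
    while frontier:
        frontier = bin_union(frontier, sh) - closure
        closure |= frontier
    return closure


def group(names):
    """Build a sharing group from a space-separated list of variable names."""
    return frozenset(names.split())


# Hypothetical input: binding x -> y, so x does not occur in t = y.
sh = {group("x a"), group("x y b"), group("x y c"), group("y d")}
sh_x = rel({"x"}, sh)
sh_t = rel({"y"}, sh)
star_xt = star_union(sh_x & sh_t)  # non-empty: the two terms possibly share

# Expression with the two inner binary unions (both terms linear, none free).
full = bin_union(sh_x | bin_union(sh_x, star_xt),
                 sh_t | bin_union(sh_t, star_xt))
# Simpler expression that is enough on the non-redundant domain.
cheap = bin_union(sh_x, sh_t)

print(len(full), len(cheap), cheap <= full)  # prints: 13 8 True
```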



6 Experimental Evaluation 

Example 31 shows that an analysis based on the new abstract unification operator
can be strictly more precise than one based on the classical proposal. However,
that example is artificial and leaves open the question as to whether or not such
a phenomenon actually happens during the analysis of real programs and, if so,
how often. This was the motivation for the experimental evaluation we describe in
this section. We consider the abstract domain $Pos \times SFL_2$ (Bagnara et al. 2001),
where the non-redundant version $SFL_2$ of the domain SFL is further combined,
as described in (Bagnara et al. 2001, Section 4), with the definite groundness
information computed by Pos, and compare the results using the (classical) abstract
unification operator of (Bagnara et al. 2001, Definition 4) with the (new) operator
$\mathrm{amgu}_S$ given in Definition 28. Taking this as a starting point, we experimentally
evaluate eight variants of the analysis arising from all possible combinations of the
following options:

1. the analysis can be goal independent or goal dependent;

2. the set-sharing component may or may not have widening enabled (Zaffanella et al. 1999);

3. the abstract domain may or may not be upgraded with structural information
using the Pattern(·) operator (see (Bagnara et al. 2000) and (Bagnara et al. 2001,
Section 5)).

The experiments have been conducted using the China analyzer (Bagnara 1997)
on a GNU/Linux PC system. China is a data-flow analyzer for (constraint) logic 
programs performing bottom-up analysis and deriving information on both call- 
patterns and success-patterns by means of program transformations and optimized 
fixpoint computation techniques. An abstract description is computed for the call- 
and success-patterns for each predicate defined in the program. The benchmark 
suite, which is composed of 372 logic programs of various sizes and complexity, can 
be considered representative. 

The precision results for the goal independent comparisons are summarized in
Table 1. For each benchmark, precision is measured by counting the number of
independent pairs as well as the numbers of definitely ground, free and linear
variables detected. For each variant of the analysis, these numbers are then compared
by computing the relative precision improvements and expressing them using
percentages. The benchmark suite is then partitioned into several precision equivalence
classes and the cardinalities of these classes are shown in Table 1. For example, when
considering a goal independent analysis without structural information and without
widenings, the value 5 found at the intersection of the row labeled '0 < p ≤ 2' with
the column labeled 'I' should be read: "for five benchmarks there has been a
(positive) increase in the number of independent pairs of variables which is less than
or equal to two percent." Note that we only report on independence and linearity
(in the columns labeled 'I' and 'L', respectively), because no differences have been
observed for groundness and freeness. The precision class labeled 'unknown'
identifies those benchmarks for which the analyses timed out (the time-out threshold was
fixed at 600 seconds). Hence, for goal independent analyses, a precision
improvement affects from 1.6% to 3% of the benchmarks, depending on the considered
variant.
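
The bucketing of benchmarks into precision classes can be sketched as follows. This is only an illustration: the text does not give the exact formula for the relative improvement, so the sketch assumes the percentage increase of the new count over the classical one, and the function names `relative_improvement` and `precision_class` are hypothetical.

```python
def relative_improvement(old_count, new_count):
    """Percentage increase of new_count over old_count (assumed formula)."""
    if old_count == 0:
        return 0.0 if new_count == 0 else float("inf")
    return 100.0 * (new_count - old_count) / old_count


def precision_class(p):
    """Map a percentage p (None for a timed-out analysis) to a class label."""
    if p is None:
        return "unknown"
    if p == 0:
        return "same precision"
    if 0 < p <= 2:
        return "0 < p <= 2"
    if 2 < p <= 5:
        return "2 < p <= 5"
    if 5 < p <= 10:
        return "5 < p <= 10"
    # The reported table has no row for any other value.
    return "outside the reported classes"


# Made-up counts of independent pairs: classical operator vs. new operator.
print(precision_class(relative_improvement(100, 101)))  # prints: 0 < p <= 2
```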

When considering the goal dependent analyses, we obtain a single, small improve- 
ment, so that no comparison tables are included here: the improvement, affecting 
linearity information, can be observed when the abstract domain includes structural 
information. 

Goal Independent          Without Widening           With Widening
                        w/o SI     with SI      w/o SI     with SI
Prec. class            I     L     I     L     I     L     I     L
5 < p ≤ 10                   2           2           2           2
2 < p ≤ 5                                                        1
0 < p ≤ 2              5     5     9     6     6     6    12     8
same precision       357   355   337   338   366   364   360   361
unknown               10    10    26    26

Table 1. Classical $Pos \times SFL_2$ versus enhanced one: precision.

With respect to differences in the efficiency, the introduction of the new abstract
unification operator has no significant effect on the computation time: small
differences (usually improvements) are observed on as many as 6% of the benchmarks for
the goal independent analysis without structural information and without
widenings; other combinations register even fewer differences.

We note that it is not surprising that the precision and efficiency improvements 
occur very rarely since the abstract unification operators behave the same except 
under very specific conditions: the two terms being unified must not only be defi- 
nitely linear, but also possibly non-free and share a variable. 

7 Related Work 

Sharing information has been shown to be important for finite-tree analysis
(Bagnara et al. 2001; Bagnara et al. 2001). This aims at identifying those program
variables that, at a particular program point, cannot be bound to an infinite rational
tree (in other words, they are necessarily bound to acyclic terms). This novel analysis
is irrelevant for those logic languages computing over a domain of finite trees, while
having several applications for those (constraint) logic languages that are explicitly
designed to compute over a domain including rational trees, such as Prolog II and its
successors (Colmerauer 1982; Colmerauer 1990), SICStus Prolog (Swedish Institute of
Computer Science, Programming Systems …) and Oz (Smolka and Treinen 1994). The
analysis specified in (Bagnara et al. 2001) is based on a parametric abstract domain
$H \times P$, where the $H$ component (the Herbrand component) is a set of variables that are
known to be bound to finite terms, while the parametric component $P$ can be any
domain capturing aliasing, groundness, freeness and linearity information that is
useful to compute finite-tree information. An obvious choice for such a parameter is
the domain combination SFL. It is worth noting that, in (Bagnara et al. 2001), the
correctness of the finite-tree analysis is proved by assuming the correctness of the
underlying analysis on the parameter $P$. Thus, thanks to the results shown in this
paper, the proof for the domain $H \times SFL$ can now be considered complete.

Codish et al. (Codish et al. 2000) describe an algebraic approach to the sharing
analysis of logic programs that is based on set logic programs. A set logic program
is a logic program in which the terms are sets of variables and standard unification
is replaced by a suitable unification for sets, called ACI1-unification (unification
in the presence of an associative, commutative, and idempotent equality theory
with a unit element). The authors show that the domain of set-substitutions, with
a few modifications, can be used as an abstract domain for sharing analysis. They
also provide an isomorphism between this domain and the set-sharing domain SH
of Jacobs and Langen. The approach using set logic programs is also generalized
to include linearity information, by suitably annotating the set-substitutions, and
the authors formally state the optimality of the corresponding abstract
unification operator $\mathrm{lin\text{-}mgu}_{ACI1}$ (Lemma A.10 in the Appendix of (Codish et al. 2000)).
However, this operator is very similar to the classical combinations of set-sharing
with linearity (Bruynooghe et al. 1994a; Hans and Winkler 1992; Langen 1990): in
particular, the precision improvements arising from this enhancement are only
exploited when the two terms being unified are definitely independent. As we have seen
in this paper, such a choice results in a sub-optimal abstract unification operator,
so that the optimality result cannot hold. By looking at the proof of Lemma A.10
in (Codish et al. 2000), it can be seen that the case when the two terms possibly
share a variable is dealt with by referring to an example:¹ this one is supposed
to show that all the possible sharing groups can be generated. However,
even our improved operator correctly characterizes the given example, so that the
proof is wrong. It should be stressed that the $\mathrm{amgu}_S$ operator presented in this
paper, though remarkably precise, is not meant to subsume all of the proposals
for an improved sharing analysis that appeared in the recent literature (for a
thorough experimental evaluation of many of these proposals, the reader is referred
to (Bagnara et al. 2000; Zaffanella 2001)). In particular, it is not difficult to show
that our operator is not the optimal approximation of concrete unification.

In a very recent paper (Howe and King 2003), J. Howe and A. King consider the
domain SFL and propose three optimizations to improve both the precision and the
efficiency of the (classical) abstract unification operator. The first optimization is
based on the same observation we have made in this paper, namely that the
independence check between the two terms being unified is not necessary for ensuring
the correctness of the analysis. However, the proposed enhancement does not fully
exploit this observation, so that the resulting operator is strictly less precise than
our $\mathrm{amgu}_S$ operator (even when the operator $\mathrm{cyclic}_x^t$ does not come into play). In
fact, the first optimization of (Howe and King 2003) is not uniformly more precise
than the classical proposals. The following example illustrates this point.

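Example 48

Let $VI = \{x, y, z_1, z_2, z_3\}$, $(x \mapsto y) \in Bind$ and $d \stackrel{\mathrm{def}}{=} \langle sh, \emptyset, VI \rangle$, where
$sh = \{xz_1, xz_2, xz_3, yz_1, yz_2, yz_3\}$.

Since $x$ and $y$ are linear and independent, $\mathrm{amgu}_S$ as well as all the classical
abstract unification operators will compute $d_1 = \langle sh_1, \emptyset, \{x, y\} \rangle$, where

\[
sh_1 \stackrel{\mathrm{def}}{=} \mathrm{bin}(sh_x, sh_y) = \{xyz_1, xyz_1z_2, xyz_1z_3, xyz_2, xyz_2z_3, xyz_3\}.
\]

In contrast, a computation based on (Howe and King 2003, Definition 3.2) results
in the less precise abstract element $d_2 = \langle sh_2, \emptyset, \{x, y\} \rangle$, where

\[
sh_2 \stackrel{\mathrm{def}}{=} \mathrm{bin}(sh_x^*, sh_y) \cap \mathrm{bin}(sh_x, sh_y^*) = sh_1 \cup \{xyz_1z_2z_3\}.
\]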
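
The two sharing sets of Example 48 can be recomputed mechanically. The Python sketch below does so under the standard readings of binary union (all pairwise unions) and star-union (closure under union); the helper names `bin_union`, `star_union` and `group` are illustrative and do not come from either paper.

```python
def bin_union(sh1, sh2):
    """Binary union: every pairwise union of a group of sh1 with a group of sh2."""
    return {a | b for a in sh1 for b in sh2}


def star_union(sh):
    """Star-union: closure of sh under union."""
    closure = set(sh)
    frontier = set(sh)
    while frontier:
        frontier = bin_union(frontier, sh) - closure
        closure |= frontier
    return closure


def group(names):
    """Build a sharing group from a space-separated list of variable names."""
    return frozenset(names.split())


# Relevant components of sh for x and for y, as in the example.
sh_x = {group("x z1"), group("x z2"), group("x z3")}
sh_y = {group("y z1"), group("y z2"), group("y z3")}

# Result for linear and independent terms: a single binary union.
sh1 = bin_union(sh_x, sh_y)
# Result of intersecting the two one-sided star-union computations.
sh2 = bin_union(star_union(sh_x), sh_y) & bin_union(sh_x, star_union(sh_y))

extra = sh2 - sh1
print(len(sh1), len(sh2), sorted(map(sorted, extra)))
```

Running it gives six groups for the first set and seven for the second; the only additional group is the one containing all five variables.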

¹ The proof refers to Example 8, which however has nothing to do with the possibility that the
two terms share; we believe that Example 2 was intended.
The second optimization shown in (Howe and King 2003) is based on the
enhanced combination of set-sharing and freeness information, which was originally
proposed in (Filé 1994). In particular, the authors propose a slightly different
precision enhancement, less powerful as far as precision is concerned, which however
seems to be amenable to an efficient implementation. The third optimization
in (Howe and King 2003) exploits the combination of the domain SFL with the
groundness domain Pos.

8 Conclusion 

In this paper we have introduced the abstract domain SFL, combining the set- 
sharing domain SH with freeness and linearity information. While the carrier of SFL 
can be considered standard, we have provided the specification of a new abstract 
unification operator, showing examples where this operator achieves more precision 
than the classical proposals. The main contributions of this paper are the following: 

• we have defined a precise abstraction function, mapping arbitrary substitu- 
tions in rational solved form into their most precise approximation on SFL; 

• using this abstraction function, we have provided the mandatory proof of 
correctness for the new abstract unification operator, for both finite-tree and 
rational-tree languages; 

• we have formally shown that the domain SFL is uniformly more precise than 
the domain ASub; we have also provided an example showing that all the clas- 
sical approaches to the combinations of set-sharing with freeness and linearity 
fail to satisfy this property; 

• we have shown that, in the definition of SFL, we can replace the set-sharing
domain SH by its non-redundant version PSD. As a consequence, it is possible
to implement an algorithm for abstract unification running in polynomial time
and still obtain the same precision on all the considered observables, that is,
groundness, independence, freeness and linearity.